Selection System

The present invention relates to a screening system useful for screening repertoires of
DNA binding domains. In particular the invention relates to a screening system based on
transcriptional activators of bacterial σ54-dependent promoters.

The majority of proteins involved in cellular functions do so by interacting with other
proteins or nucleic acid sequences within the cell. Several approaches have been described
that allow the in vivo selection of nucleic acids which express polypeptides capable of
binding to proteins or DNA in the cell. Arguably the most powerful approaches are the
yeast one- and two-hybrid systems (Fields S. & Song O. (1989) Nature 340, 245; see US
Patent 5,283,173, incorporated herein by reference in its entirety) for the screening of
protein-DNA and protein-protein interactions, respectively. However, the two-hybrid
system requires a eukaryotic host and consequently the diversity that can be screened is
limited. Furthermore, the system notoriously suffers from an abundance of false positives.

Larger molecular repertoires can be prepared in bacterial hosts and a number of bacterial
systems for the screening of protein-protein and protein-DNA interactions have also been
reported. Two systems have been put forward in which the polypeptide chain of an enzyme
is expressed in two parts fused to two candidate polypeptides, and in which interaction
between the candidate polypeptides reconstitutes the function of the enzyme (Karimova G.
et al (1998) Proc. Natl. Acad. Sci. USA 95, 5752; Pelletier J.N. et al (1998) Proc. Natl.
Acad. Sci. USA 95, 12141).

Several in vivo screens for DNA-binding proteins have also been reported (reviewed in
Mossing M.C., Bowie J.U. & Sauer R.T. (1991) Methods Enzymol. 208, 604; Elledge S.J.
et al (1989) Proc. Natl. Acad. Sci. USA 86, 3689). Each of these methods involves the
blockage of a hybrid σ70 promoter by the DNA binding protein. Repression of the
promoter either prevents the production of a conditionally toxic gene or alleviates repression
of an antibiotic gene by transcriptional interference. The transcriptional interference assay
(Elledge et al.) has been used successfully in one case to select DNA binding proteins with
altered specificity (Sera T. & Schultz P.G. (1996) Proc. Natl. Acad. Sci. USA 93, 2920).

Another σ70-based system utilises recruitment of the polymerase to the promoter by way
of a protein-protein interaction between a protein domain fused to the RNA polymerase
α subunit and another fused to the lambda repressor bound immediately upstream of the
RNA polymerase promoter binding site (Dove S.L., Joung J.K., Hochschild A. (1997)
Nature 386, 627). By replacing the lambda repressor DNA binding domain with a library
of Zn-finger domains, specific DNA binding Zn-finger domains were selected (Joung J.K.,
Ramm E.I., Pabo C.O. (2000) Proc. Natl. Acad. Sci. USA 97, 7382).

The alternative holoenzyme form of bacterial RNA polymerase (RNAP) contains the σ54
factor (σ54-RNAP). As has been previously shown, this polymerase, in most cases, forms
a closed complex with the promoter. Unlike σ70 promoters, at which the RNA polymerase
is bound in an active form and is largely controlled by repression, the σ54 RNA
polymerase holoenzyme is transcriptionally incompetent and is unable to initiate
transcription by itself. Initiation of transcription requires the presence of a transcriptional
activator that catalyses the isomerisation of the closed promoter complex to an open one.
Typically, activator proteins bind to a specific upstream activation sequence (UAS) located
80 to 200 bp upstream of the σ54 core promoter. The function of the UAS is to tether the
activator in the right position and to bring it into the vicinity of the promoter in order to
increase the efficiency of interaction between the σ54 RNAP and the activator.
Transcriptional activators of σ54-dependent promoters have been called bacterial
enhancers because their mechanism of activation is superficially similar to the activation of
transcription by enhancer proteins in eukaryotes (Kustu S. et al (1991) Trends Biochem. Sci.
16, 397).

Conversion of the σ54 RNAP into an active form is catalysed by the binding of an
enhancer protein coupled to hydrolysis of ATP. This unusual mechanism accounts for the
low level of background transcription and the enormous difference (up to 10^5-fold) between on
and off states in the strongest σ54 promoters effected by a single factor. In comparison,
activators of σ70 promoters such as CAP or λcI increase transcription levels usually by
less than 10-fold.

Transcriptional activators of σ54 promoters (also known as enhancer-binding proteins or
EBPs) share a common structure (see Morett and Segovia (1993) J. Bacteriol. 6067-
6074) comprising a non-conserved N-terminal domain which has a putative regulatory
function, a central domain which is responsible for transcriptional activation, and a
C-terminal DNA binding domain which binds the relevant UAS in the target gene. The
domains are modular: the central and N-terminal domains together are capable of
constitutive activation of σ54 RNAP when overexpressed. At least in some cases, the
isolated DNA binding domain is capable of specifically binding its DNA recognition site.

In many instances, interaction between σ54 RNAP and the activator is enhanced by a
cellular factor which promotes DNA bending between the UAS and the σ54 promoter
(Freundlich et al. (1992) Mol. Microbiol. 6:2557-2563). This factor, known as integration
host factor (IHF), acts to promote transcription from σ54 promoters.

Summary of the Invention 

We provide herein a novel screening system which is based on transcriptional activators of
σ54-based promoters.

According to a first aspect of the invention, therefore, there is provided a method for
detecting a protein-nucleic acid interaction between a nucleic acid molecule and a protein
molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.

The invention provides a reporter system which is characterised by very low levels of
background expression, since the σ54 polymerase is transcriptionally incompetent in the
absence of a σ54 transcriptional activator. Since, at physiological concentrations, the
binding of the transcriptional activator to the nucleic acid is required in order to activate
transcription by σ54 RNAP, the system of the invention may be used as a tool for
investigating and/or screening protein/nucleic acid interactions exploiting the reporter gene
read-out.

In the first aspect of the invention, either the nucleic acid binding protein or the nucleic
acid molecule may be provided in the form of a repertoire of molecules. Repertoires of
hybrid σ54 activator proteins preferably are partially or completely randomised at least in
the heterologous nucleic acid binding sequence. This allows selection from the library of
molecules having desired nucleic acid binding characteristics.

Repertoires of nucleic acid molecules advantageously are partially or completely
randomised in the binding site for the nucleic acid binding sequence of the σ54 activator
protein. This allows selection of nucleic acid molecules having desired binding sites for
the chimeric activators.
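
The selection logic of the first aspect can be sketched in a few lines of Python. This is an
illustrative model only, not a description of any actual implementation: the names
HybridActivator, Reporter, reporter_expressed and cross_repertoires, and the clone and site
labels, are hypothetical. The reporter is scored as expressed only when the hybrid activator
carries a constitutively active activating domain and its binding sequence recognises the
site placed upstream of the σ54 RNAP binding site.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class HybridActivator:
    """Hybrid sigma-54 activator: heterologous binding domain + activating domain."""
    name: str
    recognised_sites: frozenset  # hypothetical labels of sites the domain recognises
    constitutively_active: bool = True


@dataclass(frozen=True)
class Reporter:
    """Reporter construct: one binding site upstream of a sigma-54 RNAP binding site."""
    name: str
    binding_site: str


def reporter_expressed(activator: HybridActivator, reporter: Reporter) -> bool:
    """The reporter is 'on' only if an active activator is tethered at its binding site."""
    return activator.constitutively_active and reporter.binding_site in activator.recognised_sites


def cross_repertoires(activators, reporters):
    """Test every activator against every reporter; return the pairs that score 'on'."""
    return [(a.name, r.name) for a, r in product(activators, reporters)
            if reporter_expressed(a, r)]


# Hypothetical repertoires; labels are placeholders, not real sequences.
activators = [
    HybridActivator("clone-1", frozenset({"site-A"})),
    HybridActivator("clone-2", frozenset({"site-B", "site-C"})),
    HybridActivator("clone-3", frozenset()),  # binds nothing in this panel
]
reporters = [Reporter("rep-A", "site-A"), Reporter("rep-B", "site-B"), Reporter("rep-D", "site-D")]

hits = cross_repertoires(activators, reporters)
print(hits)
```

In this toy model either side may be the repertoire: fixing one reporter and varying the
activators corresponds to selecting binding domains, while fixing one activator and varying
the reporters corresponds to selecting binding sites.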

In a second aspect of the invention, there is provided a system for selecting protein-protein
interactions based on the constitutively active hybrid σ54 activators described above. The
system according to the invention is conceptually similar to the yeast two-hybrid system.

Accordingly, there is provided a method for detecting a protein-protein interaction,
comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and a
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

As will be apparent to those skilled in the art, reference to a "binding site" for the nucleic
acid binding sequence includes the provision of several appropriately spaced binding sites
in the nucleic acid molecule.

As with the yeast two-hybrid system, in which a modular transcription factor is assembled
through binding of DNA binding domain/bait and transcription activating domain/prey
hybrids, the association of the nucleic acid binding sequence and the σ54 transcription
activating domain through the bait/prey interaction allows the detection of, and screening
for, protein-protein binding interactions in vivo and in vitro. Advantageously, the bait
and/or prey polypeptide sequences are provided in the form of repertoires, which may be
partially or completely randomised. This allows selection of prey polypeptides based on
their ability to form interactions with a desired bait (or vice versa). As the assay may be
conducted in vivo, in a bacterium, the invention permits the detection of in vivo binding
interactions between polypeptides in bacteria.
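
The two-hybrid logic of steps a) to e) can be sketched as a small Python model. It is
illustrative only: the class and function names, the site labels and the interaction table
are all hypothetical. The reporter is scored as on only when two conditions hold together:
the bait hybrid is bound at the reporter's binding site, and bait and prey interact so that
the σ54 transcription activating domain is tethered there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaitHybrid:
    """First hybrid protein: nucleic acid binding sequence fused to a bait."""
    binds_site: str
    bait: str


@dataclass(frozen=True)
class PreyHybrid:
    """Second hybrid protein: prey fused to a constitutively active activating domain."""
    prey: str


def reporter_on(bait_hybrid, prey_hybrid, reporter_site, interactions):
    """Reporter is 'on' only if the bait hybrid is bound at the reporter's binding
    site AND bait and prey interact, tethering the activating domain."""
    tethered = bait_hybrid.binds_site == reporter_site
    paired = frozenset({bait_hybrid.bait, prey_hybrid.prey}) in interactions
    return tethered and paired


# Hypothetical interaction table: unordered pairs of polypeptides assumed to bind.
interactions = {frozenset({"bait-X", "prey-1"})}

bait = BaitHybrid(binds_site="site-A", bait="bait-X")
prey_repertoire = [PreyHybrid("prey-1"), PreyHybrid("prey-2")]

# Screen a prey repertoire against one bait on a reporter carrying site-A.
selected = [p.prey for p in prey_repertoire if reporter_on(bait, p, "site-A", interactions)]
print(selected)
```

Swapping the roles (one prey, a repertoire of baits) uses the same function unchanged, which
mirrors the "or vice versa" in the text.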

It will be apparent that the hybrid proteins useful in the methods of the invention are
advantageously provided in the form of nucleic acid vectors or libraries thereof capable of
expressing said proteins in a host bacterium. Advantageously, the vector(s) include first
and second chimeric genes which encode the hybrid proteins of the invention. Preferably,
the vectors also include means for replication in bacteria. Also included may be one or
more marker genes, the expression of which in the bacterium permits selection of cells
containing the vector(s) from cells that do not contain the vector(s). Preferably, the
vector(s) are plasmid(s).

In a third aspect, the invention provides a method for screening a repertoire of
candidate DNA-bending polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in an IHF- (IHF-deficient) host cell, such that the σ54 activator and the nucleic
acid molecule may interact, and transcription is activated from the σ54 RNAP binding site
in a manner dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

It is known that activation by σ54 activators may be regulated by factors which induce
DNA bending in the target gene. For example, the host factor IHF is known to potentiate
σ54 activation; moreover, it may be replaced by alternative DNA bending polypeptides, or
by intrinsically bent DNA.

The invention moreover provides methods for development of improved σ54
activator-based tools.

The first chimeric gene includes a nucleic acid sequence that encodes a nucleic acid-binding
domain and a first (bait) test protein or protein fragment in such a manner that the first test
protein is expressed as part of a hybrid protein with the nucleic acid-binding domain.

The second chimeric gene also includes a promoter and a transcription termination signal
to direct transcription. The second chimeric gene moreover includes a nucleic acid
sequence that encodes a σ54 transcriptional activation domain and a second (prey) test
protein or protein fragment into the vector, in such a manner that the second test protein is
capable of being expressed as part of a hybrid protein with the transcriptional activation
domain.

The invention moreover provides kits for practising the invention, which kits
advantageously comprise a container, two vectors, and a host cell. The first vector contains
a promoter and may include a transcription termination signal functionally associated with
the first chimeric gene in order to direct the transcription of the first chimeric gene. The
chimeric gene advantageously comprises one or more unique restriction site(s) to insert a
nucleic acid sequence encoding a test bait polypeptide. The kit may also include a
second vector which contains a second chimeric gene, optionally comprising one or more
unique restriction site(s) to insert a nucleic acid sequence encoding the prey polypeptide;
alternatively, the second chimeric gene may be present on the same vector as the first
chimeric gene.

Brief Description of the Figures

Figure 1A is a schematic representation of σ54 RNAP activation by the σ54 activator NifA.

Figure 1B is a schematic representation of the invention, in which the σ54 activator DNA
binding domain is replaced with a heterologous GCN4 DNA binding domain.

Figure 2 is a schematic representation of the first aspect of the present invention, in which
a library of DNA binding domains is screened together with a library of DNA binding
domain binding sites to identify protein:DNA binding pairs.

Figure 3 shows the activation of transcription by NifA chimeras, expressed as percent of
wt activity (NifA/UAS). Nif-GCN4 (in the presence of the NifAΔC coactivator (NifADC))
shows close to wt activity. Equal activity is observed for the two distinct GCN4 DNA
recognition sites (ATF/CREB and AP-1). Less than 1% of wt activity is observed with a
non-cognate reporter such as one bearing the wt nifH UAS.

Figure 4 shows the activation of transcription by NifA chimeras, expressed as percent of
wt activity (NifA/UAS). Nif-ERDBD (in the presence of the NifAΔC coactivator (NifADC))
shows ca. 80% of wt activity. Very little activation is observed with a non-cognate reporter
bearing the DNA recognition site (GRE) for the closely related glucocorticoid receptor.

Figure 5 shows the coactivation by different NifA variants, expressed as percent of wt
activity (NifAwt). NifA from Klebsiella pneumoniae (NifAKp) is superior to all others,
even exceeding wt activity (up to 160%). NifAKp with its DNA binding domain deleted
(NifAΔCKp (NifADCKp)) is almost as active.
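
The figures report reporter output as a percentage of wild-type (wt) activity. That
normalisation is simple arithmetic, sketched below in Python. The function name
percent_of_wt is hypothetical, the optional background subtraction is an assumption rather
than something the text specifies, and the readings are invented purely for illustration;
they are not data from the figures.

```python
def percent_of_wt(sample: float, wt: float, background: float = 0.0) -> float:
    """Express a reporter reading as a percentage of the wild-type reading.

    A common background is subtracted from both readings (assumed; defaults to 0).
    """
    denominator = wt - background
    if denominator <= 0:
        raise ValueError("wild-type reading must exceed background")
    return 100.0 * (sample - background) / denominator


# Invented readings in arbitrary reporter units (not data from the figures).
wt_reading = 2000.0
readings = {
    "chimera on cognate site": 1600.0,
    "chimera on non-cognate site": 10.0,
}

for label, value in readings.items():
    print(f"{label}: {percent_of_wt(value, wt_reading):.1f}% of wt")
```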


Detailed Description of the Invention 

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art (e.g., in bacterial cell
culture, molecular genetics, nucleic acid chemistry, protein chemistry and biochemistry).
Standard techniques are used for molecular, genetic and biochemical methods (see
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short
Protocols in Molecular Biology (1999) 4th Ed., John Wiley & Sons, Inc., which are
incorporated herein by reference), chemical methods, pharmaceutical formulations and
delivery and treatment of patients.

A: Nucleic Acids and Proteins 


As used herein, "nucleic acid" refers to any natural nucleic acid, including RNA and DNA,
as well as synthetic nucleic acid comprising modified or synthetic bases, and mixtures of
modified or synthetic bases with natural bases. Such modified and/or synthetic bases may
be referred to as derivatives of DNA or RNA. Preferably, "nucleic acid" refers to DNA.



The invention includes the use of modified and/or artificial nucleic acids. A number of
modifications have been described that alter the chemistry of the phosphodiester
backbone, sugars or heterocyclic base components of nucleic acids.

Among useful changes in the backbone chemistry are phosphorothioates;
phosphorodithioates, where both of the non-bridging oxygens are substituted with
sulphur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral
phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate,
3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids
replace the entire phosphodiester backbone with a peptide linkage.

Sugar modifications are also known. The α-anomer of deoxyribose may be used, where
the base is inverted with respect to the natural β-anomer. The 2'-OH of the ribose sugar
may be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provides resistance to
degradation without compromising affinity.

Modification of the heterocyclic bases must maintain proper base pairing. Some useful
substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and
5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and
5-propynyl-2'-deoxycytidine have been shown to maintain biological activity when
substituted for deoxythymidine and deoxycytidine, respectively.

As used herein, the term "protein" includes single-chain polypeptide molecules as well as
multiple-polypeptide complexes where individual constituent polypeptides are linked by
covalent or non-covalent means. As used herein, the terms "polypeptide" and "peptide"
refer to a polymer in which the monomers are amino acids and are joined together through
peptide or disulphide bonds. The term domain also refers to polypeptides and peptides
having biological function. A peptide useful in the invention will have a binding or
transcription activating capability, i.e., with respect to binding to nucleic acids, other
proteins or polypeptides, and activation of σ54 RNAP transcription. It also may have
another biological function that is a biological function of a protein or domain from which
the peptide sequence is derived.

A hybrid protein is a protein or polypeptide which comprises constituent parts derived
from at least two naturally-occurring or artificial proteins. In particular, it may comprise
the DNA-binding domain of one protein and the protein-binding or transcription activating
domain of a second protein.

B: σ54 Activators

Activators of σ54 transcription are well known and have been reviewed, for example, by
Buck et al., J Bacteriol. 2000 Aug;182(15):4129-36; Studholme and Buck, FEMS
Microbiol Lett. 2000 May 1;186(1):1-9; Shingler, Mol Microbiol. 1996 Feb;19(3):409-16;
Goosen and van de Putte, Mol Microbiol. 1995 Apr;16(1):1-7; Merrick, Mol Microbiol.
1993 Dec;10(5):903-9; and others. A family of such activator proteins has been defined,
and its members found to share homology in the central (catalytic) domain which is
responsible for σ54 RNAP activation.

Members of the family include the following (the numbers are GenBank accession 
numbers) 

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR
emb|CAA26472.1| (X02616) pot. NifA gene product (aa 1-484) [Klebsiella pneumoniae]
emb|CAA53584.1| (X75972) anfA [Rhodobacter capsulatus]
emb|CAA92413.1| (Z68203) NifA homologue [Rhizobium sp.]
emb|CAA93242.1| (Z69251) MopR [Acinetobacter calcoaceticus]
emb|CAB53157.1| (X07567) NifA I [Rhodobacter capsulatus]
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri]
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli]
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa]
gb|AAB91397.1| (AF033203) NifAO protein [Rhodobacter capsulatus]
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis]
gb|AAC37124.1| (L81176) FleQ [Pseudomonas aeruginosa]
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus]
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae]
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator NifA [Rhodospirillum rubrum]
gb|AAD38416.1| (AF155934) NifA [Alcaligenes faecalis]
gb|AAF28395.1| (AF069392) FlaM [Vibrio parahaemolyticus]
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein
gb|AAF61932.1| (AF230804) sigma-54 activator protein Actl [Myxococcus xanthus]
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa]
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae]
gb|AAG01527.1|AF288483_1 (AF288483) NifA [Azospirillum brasilense]
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli
pir||B49940 nitrogen regulator I homolog - Escherichia coli
pir||C70320 transcription regulator NifA family - Aquifex aeolicus
pir||C70396 transcription regulator NtrC family - Aquifex aeolicus
pir||C70454 transcription regulator NtrC family - Aquifex aeolicus
pir|P70315 transcription regulator NtrC family - Aquifex aeolicus
pir||H69581 transcription activator of acetoin dehydrogenase operon acoR - Bacillus subtilis
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens
pir||JC5471 regulatory protein NifA - Azospirillum lipoferum
pir||T08624 probable NtrC-type response regulator - Eubacterium acidaminophilum
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49.1 KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB)
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN
sp|Q06063|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ARGININE
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR

Moreover, a number of polypeptides belonging to the σ54 activator family have been
described whose 3D structures are known. These include: 113161 acetoin catabolism
regulatory protein; 113629 alginate biosynthesis transcriptional regulatory protein ALGB;
266789 type 4 fimbriae expression regulatory protein PILR; 113833 nitrogen fixation
protein ANFA; 138884 nitrogen fixation protein VNFA; 128219 nif-specific regulatory
protein; 3024194 nif-specific regulatory protein; 1168533 acetoacetate metabolism
regulatory protein ATOC (ornithine/arginine decarboxylase inhibitor) (ornithine
decarboxylase antizyme); 417166 transcriptional regulatory protein HYDG; 266622
nif-specific regulatory protein; 1352500 nif-specific regulatory protein; 128224 nif-specific
regulatory protein; 128225 nif-specific regulatory protein; 128221 nif-specific regulatory
protein; 128226 nif-specific regulatory protein; 1346014 transcriptional regulatory protein
FLBD; 549560 hypothetical sigma-54-dependent transcriptional regulator in GUTQ-HYPF
intergenic region; 139857 transcriptional regulatory protein XYLR (67 kd protein); 120053
formate hydrogenlyase transcriptional activator; 2507375 hypothetical 49.1 kd protein in
GLNB-PURL intergenic region (ORFXB) (orf-2); 134961 signal-transduction and
transcriptional-control protein; 1171795 nitrogen assimilation regulatory protein; 417388
nitrogen regulation protein NR(I); 123466 hydrogenase transcriptional regulatory protein
HOXA; 399925 hydrogenase transcriptional regulatory protein HOXA; 585586 nitrogen
assimilation regulatory protein NTRX; 118399 C4-dicarboxylate transport transcriptional
regulatory protein DCTD; 585267 pathogenicity locus probable regulatory protein HRPR;
1346313 pathogenicity locus probable regulatory protein HRPS; 549447 pathogenicity
locus probable regulatory protein WTSA; 585909 arginine utilization regulatory protein
ROCR; 136600 transcriptional regulatory protein TYRR; 1174836 transcriptional
regulatory protein TYRR homolog; 123748 hydrogenase transcriptional regulatory protein
HUPR1; 128604 nitrogen regulation protein NTRC; 1169293 glycerol metabolism operon
regulatory protein; and 129957 phosphoglycerate transport system transcriptional
regulatory protein PGTA. The numbers are GenBank gi numbers.


Preferably, the hybrid σ54 activator is based on the NifA activator. The Nif family of
bacterial enhancers regulate expression of nitrogenase components from σ54 promoters in
nitrogen-fixing bacteria, and are inhibited by NifL (Austin S. et al (1994) J. Bacteriol. 176,
3460). In bacteria lacking NifL, NifA is constitutively active. NifA is modular in
architecture and it is shown herein that this allows for the swapping of the natural
DNA-binding domain (DBD) for heterologous DBDs. Such NifA-DBD chimaeras are inactive
on the wild type promoter, but activate transcription from hybrid promoters bearing their
cognate target sequences.

Advantageously, the hybrid σ54 activator may be based on E. coli PspF (see Jovanovic et
al. (1996) J. Bacteriol. 178:1936-1945). PspF lacks the N-terminal regulatory domain
typical of σ54 activators, and is constitutively active but negatively regulated by PspA.
Thus, in bacteria lacking PspA, PspF is constitutively active.

Other σ54 activators may be rendered constitutively active by removal of the N-terminal
regulatory domain or by appropriate mutation.

Nucleic acid binding sequences or domains are known in the art and may be derived from
σ54 activator proteins or any other DNA binding proteins, whether naturally-occurring or
synthetic. Moreover, DNA-binding domains may be synthesised by partial or complete
randomisation. Many naturally-occurring DNA-binding proteins contain independently
folded domains for the recognition of DNA, and these domains in turn belong to a large
number of structural families, such as the leucine zipper, the homeodomain, the
"helix-turn-helix", the zinc finger and various other transcription factor families.

C: Libraries 


The term library refers to a mixture of heterogeneous polypeptides or nucleic acids. The
library is composed of members, each of which has a unique polypeptide or nucleic acid
sequence. To this extent, library is synonymous with repertoire, although in general the
term "library" is used herein to denote the source of the repertoire - e.g. a library of
nucleic acid molecules which encodes a repertoire of polypeptides. Sequence differences
between library members are responsible for the diversity present in the library. The
library may take the form of a simple mixture of polypeptides or nucleic acids, or may be
in the form of organisms or cells, for example bacteria, viruses, animal or plant cells and the
like, transformed with a library of nucleic acids. Advantageously, the nucleic acids are
incorporated into expression vectors, in order to allow expression of the polypeptides
encoded by the nucleic acids. In a preferred aspect, therefore, a library may take the form
of a population of host organisms, each organism containing one or more copies of an
expression vector containing a single member of the library in nucleic acid form which
can be expressed to produce its corresponding polypeptide member. Thus, the population
of host organisms has the potential to encode a large repertoire of genetically diverse
polypeptide variants.
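
How large a repertoire a population of host organisms can actually represent depends on the
number of possible variants and the number of independent transformants. The Python sketch
below illustrates the arithmetic under stated assumptions that are not taken from the text:
randomisation with NNK codons (N = A/C/G/T, K = G/T, giving 32 codons per position), equal
sampling probability for every variant, and hypothetical values for the number of randomised
codons and the library size. The function names are likewise hypothetical.

```python
import math


def nnk_dna_diversity(n_codons: int) -> int:
    """Distinct DNA sequences when n codons are randomised as NNK:
    4 * 4 * 2 = 32 possible codons per position."""
    return 32 ** n_codons


def peptide_diversity(n_codons: int) -> int:
    """Distinct stop-free peptide sequences (20 amino acids per position)."""
    return 20 ** n_codons


def expected_coverage(n_transformants: float, n_variants: float) -> float:
    """Expected fraction of variants present at least once, assuming each
    variant is equally likely to be sampled (Poisson approximation)."""
    return 1.0 - math.exp(-n_transformants / n_variants)


n = 5                 # hypothetical number of randomised codons
library_size = 1e8    # hypothetical number of independent transformants

variants = nnk_dna_diversity(n)
print(variants, peptide_diversity(n))
print(round(expected_coverage(library_size, variants), 3))
```

The same calculation shows why host choice matters for diversity: for a fixed number of
randomised positions, coverage falls off quickly once the number of transformants drops
below the number of possible variants.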

Libraries of hybrid proteins may be prepared and selected together with libraries of hybrid
nucleic acids. "Crossing" of hybrid libraries is performed by combinatorial infection,
which has been employed successfully to generate very large antibody libraries (Griffiths
et al (1994) EMBO J. 13, 3245).

Although libraries for use in the present invention may be phage libraries, as is known in
the art, it is possible to use alternative libraries which are constructed using other vectors,
such as plasmids. In any case, the present invention does not require the library to be
capable of "display" of the gene product at the bacterial surface, as with phage libraries;
rather, the gene product is preferably expressed intracellularly, and is advantageously not
expressed as a fusion with a vector gene product.

DNA binding domain libraries are preferably based on known DNA binding domain
architectures (e.g. basic leucine zipper, bZIP) and may be derived using PCR
amplification with "family-specific" primers. Such libraries may be crossed with hybrid
promoters bearing defined target sequences or libraries of target sequences. In addition to
providing information on the distribution of members of the family in a given genome,
such libraries may be used to identify and study proteins or molecular compounds that
modify DNA interaction within a family of DNA binding domains, for example Tax
(from HTLV-1) in the case of bZIP proteins.

In an alternative embodiment, they may also be used to select DNA binding domains
which conditionally bind their target sequence only in the presence of other factors such
as protein cofactors or small molecular compounds, for example drugs that intercalate
into DNA or alter the degree of supercoiling or recognise DNA sequences which have
been modified chemically (e.g. methylated). The system can also be used "in reverse", i.e.
to select proteins or molecular compounds that disrupt a particular DNA-protein
interaction or to select DNA binding domains that do not bind a particular target sequence
or library thereof.

More advanced libraries are preferably derived directly from genomic DNA or cDNA
libraries and selected on hybrid promoters bearing a repertoire of target sequences,
comprising either a stretch of randomised sequence or a library of inserts derived from
fragmented genomic DNA. Data obtained in this way allows the compilation of a genomic
directory of DNA binding domains and the building of a promoter-DNA binding domain
interaction map.

D: Hybrid polypeptides 

The generation of hybrid polypeptides by domain fusion is well known in the art and may
be effected by fusing polypeptides or, preferably, by fusing nucleic acids which encode
the polypeptides. It has been known since 1976 that DNA binding and transcriptional
activator domains are separable, and can be swapped between proteins; see Ma and
Ptashne, who reported (Cell (1987) 51, 113-119; Cell (1988) 55, 443-446) that when both
the GAL4 N-terminal domain and C-terminal domain are fused together in the same
protein, transcriptional activity is induced. Other proteins are also known to function as
transcriptional activators via the same mechanism. For example, the GCN4 protein of
Saccharomyces cerevisiae, as reported by Hope and Struhl, Cell, 46, 885-894 (1986), the
ADR1 protein of Saccharomyces cerevisiae, as reported by Thukral et al., Molecular and
Cellular Biology, 9, 2360-2369 (1989), and the human estrogen receptor, as discussed by
Kumar et al., Cell, 51, 941-951 (1987), all contain separable domains for DNA binding
and for maximal transcriptional activation.

The same is specifically known of the σ54 bacterial transcriptional activators, although a
genetic screen based thereon has not been proposed. Therefore, the present invention may
be carried out using techniques which are known to those skilled in the art, particularly as
applied to 2-hybrid techniques in eukaryotic cells.

Synthesis of chimeric genes for the purposes of the present invention may be carried out
by any desired means, including polynucleotide synthesis and mutagenesis approaches.
For example, a number of methods for site-directed mutagenesis are known in the art,
from methods employing single-stranded phage such as M13 to PCR-based techniques
(see "PCR Protocols: A guide to methods and applications", M.A. Innis, D.H. Gelfand,
J.J. Sninsky, T.J. White (eds.), Academic Press, New York, 1990). Preferably, the
commercially available Altered Site II Mutagenesis System (Promega) may be employed,
according to the directions given by the manufacturer.

E: Host Cells 

Host cells useful in conjunction with the present invention are prokaryotic cells,
advantageously bacterial cells. E. coli is the preferred host; however, host cells may
belong to any species or genus in which σ54 RNAP-driven transcription is possible, such
as Klebsiella, Rhodobacter, Rhizobium, Acinetobacter, Pseudomonas, Escherichia,
Bacillus, Caulobacter, Vibrio, Rhodospirillum, Alcaligenes, Salmonella, Myxococcus,
Xylella, Azospirillum, Aquifex, Agrobacterium and other organisms. In E. coli, the
preferred configuration is a modified strain, in which a truncated form of Nif (or another
activator) is coexpressed to boost specific activation (see Methods).

Preferably, the host cells lack repressors of the σ54 activator being used, such that the
transcription activating domain is constitutively active. Repressors may be deleted by
genetic mutation and/or selection, or inhibited by expression of antisense constructs, or
the like. In general, due to the accessibility of bacterial genetics, especially in E. coli,
deletion of repressor genes is straightforward to those skilled in the art.

F: Reporter Genes 

Reporter genes of various types are known in the art and may be used in conjunction with
the present invention. A "reporter gene", as referred to herein, may be the coding
sequence which encodes a detectable gene product, or the coding sequence including the
necessary control sequences for its expression in accordance with the invention, as
appropriate.

Advantageously, the reporter gene is selected from the group consisting of metabolic
markers such as the lac operon (lacZ, lacY and lacA); proteins conferring a fluorescent
phenotype, such as GFP; proteins conferring antibiotic resistance, such as Zeo; and
proteins conferring another selectable property.

Certain reporters, such as the lacZ gene, are widely used in bacterial genetics and are
useful in the performance of the invention. However, other genes may also be employed,
including fluorescent proteins. For example, green fluorescent proteins (GFPs) of
cnidarians, which act as their energy-transfer acceptors in bioluminescence, can be used in
the invention. A green fluorescent protein, as used herein, is a protein that fluoresces
green light, and a blue fluorescent protein is a protein that fluoresces blue light. GFPs
have been isolated from the Pacific Northwest jellyfish, Aequorea victoria; from the sea
pansy, Renilla reniformis; and from Phialidium gregarium (Ward et al., 1982,
Photochem. Photobiol., 35: 803-808; Levine et al., 1982, Comp. Biochem. Physiol., 72B:
77-85).

A variety of Aequorea-related GFPs having useful excitation and emission spectra have
been engineered by modifying the amino acid sequence of a naturally-occurring GFP from
Aequorea victoria (Prasher et al., 1992, Gene, 111: 229-233; Heim et al., 1994, Proc.
Natl. Acad. Sci. U.S.A., 91: 12501-12504; PCT/US95/14692). As used herein, a
fluorescent protein is an Aequorea-related fluorescent protein if any contiguous sequence
of 150 amino acids of the fluorescent protein has at least 85% sequence identity with an
amino acid sequence, either contiguous or non-contiguous, from the wild-type Aequorea
green fluorescent protein (SwissProt Accession No. P42212). Similarly, the fluorescent
protein may be related to Renilla or Phialidium wild-type fluorescent proteins using the
same standards.

Aequorea-related fluorescent proteins include, for example, wild-type (native) Aequorea
victoria GFP, whose nucleotide and deduced amino acid sequences are presented in
GenBank Accession Nos. L29345, M62654, M62653, and other Aequorea-related
engineered versions of Green Fluorescent Protein, of which some are listed above. Several
of these, i.e., P4, P4.3, W7 and W2, fluoresce at a distinctly shorter wavelength than wild
type.

A specific advantage of fluorescent proteins is that they facilitate FACS sorting of cells in
a manner dependent on reporter gene expression (Norman, S.O. (1980). Flow cytometry.
Med. Phys. 7, 609-615; Mackenzie, N.M. & Pinder, A.C. (1986). The application of flow
microfluorimetry to biomedical research and diagnosis: a review. Dev. Biol. Stand. 64,
181-193).

Other reporter genes may complement auxotrophic mutations, confer antibiotic resistance
or other selectable characteristics to the host bacteria. Reporter genes may be wholly or
partly heterologous to the host cell, and introduced by mutagenesis and/or transformation
with appropriate vectors. Alternatively, endogenous σ54-responsive genes may be used
as reporter genes.

The reporter gene also contains a binding site for σ54 RNAP. The consensus sequence
for σ54 RNAP binding is 5' TGGCAC-N5-TTGCa/t 3'. This sequence is located at -12 to
-24 with respect to the start of transcription, whilst the more common sigma 70
recognition sequence is situated at -10 to -35. Both the GG & GC must be on the same
face of the DNA helix.
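
By way of illustration only, a consensus of this kind can be used to look for candidate RNAP binding sites in a reporter construct. The following Python sketch scans a sequence for the two conserved blocks separated by a fixed number of unspecified bases. The function name and the test sequence are hypothetical, and the spacer length is deliberately left as a parameter to be taken from the consensus stated above.

```python
import re


def find_sigma54_sites(seq, spacer):
    """Return 0-based start positions of TGGCAC-N{spacer}-TTGC[A/T] matches.

    `spacer` is the number of unspecified bases between the two conserved
    blocks; it is a parameter here rather than a hard-coded value.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=TGGCAC[ACGT]{%d}TTGC[AT])" % spacer)
    return [m.start() for m in pattern.finditer(seq)]


# Hypothetical test sequence, not taken from any real promoter.
example = "aaccTGGCACgattcTTGCAggtt"
print(find_sigma54_sites(example, spacer=5))
```

A match found in this way is only a sequence-level candidate; it says nothing about whether the site is functional in vivo.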

In order to increase specificity, combinations of two or more reporter genes (preferably in 
tandem) may be used. 

Where the reporter gene is chimeric, i.e. comprises heterologous binding sites for the
nucleic acid binding sequence and σ54 RNAP binding sites incorporated into the same
nucleic acid, the spacing between the σ54 RNAP binding site and the nucleic acid binding
sequence binding site is preferably conserved with respect to the natural gene from which
the σ54 RNAP binding site is taken. Advantageously, the spacing is at least calculated
such that the spatial relationship of the elements on respective faces of the nucleic acid
helix is maintained.
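
The requirement that the elements remain on the same face of the helix can be checked with simple arithmetic. The sketch below is an illustration under stated assumptions: it takes a helical repeat of about 10.5 bp per turn (a commonly used approximation for B-form DNA in solution, not a figure given in this specification) and an arbitrary tolerance of 40 degrees. The function names are hypothetical.

```python
def helical_phase_deg(spacing_bp, bp_per_turn=10.5):
    """Rotation (degrees, 0-360) accumulated over `spacing_bp` base pairs."""
    return (spacing_bp / bp_per_turn * 360.0) % 360.0


def same_face(natural_spacing, new_spacing, tolerance_deg=40.0, bp_per_turn=10.5):
    """True if two spacings place a site on roughly the same helical face.

    Both the helical repeat and the tolerance are assumptions made for
    illustration; they are not values taken from the specification.
    """
    diff = abs(helical_phase_deg(natural_spacing, bp_per_turn)
               - helical_phase_deg(new_spacing, bp_per_turn))
    diff = min(diff, 360.0 - diff)  # shortest angular distance
    return diff <= tolerance_deg


# Moving a site by two full turns (21 bp) keeps the face; 5 bp does not.
print(same_face(100, 121))
print(same_face(100, 105))
```

Under these assumptions, changing the spacing by a whole number of helical turns preserves the face, whereas a shift of about half a turn moves the site to the opposite face.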

Reporter genes advantageously comprise a binding site for a further activation factor, such
as IHF. These factors are believed to induce bending of the DNA, thus potentiating
activation of σ54 RNAP-driven transcription by σ54 activators. Alternatively, the DNA
itself may be intrinsically bent, thus providing constitutive potentiation of σ54-specific
activation.

G: Configurations of the Invention

The present invention may be configured in three basic ways: a first configuration, in
which reporter gene activation is dependent on the interaction between the nucleic acid
and a nucleic acid binding domain on the hybrid protein; a second configuration, in which
reporter gene activation is dependent on interaction between bait and prey polypeptides
which serves to bring together two or more components of the hybrid protein; and a third
configuration, in which reporter gene activation by a σ54 activator is dependent on the
presence of DNA-bending polypeptides. As referred to herein, an interaction is
advantageously a binding interaction.

Where the invention is configured to detect protein-nucleic acid interaction, libraries of
proteins and/or nucleic acids may be prepared as described above. Proteins having
improved nucleic acid binding, or nucleic acid sequences having improved affinity for
protein domains, may be developed by mutagenesis and selection of candidate sequences.
Alternatively, protein and/or nucleic acid sequences may be used to identify, in vivo,
cognate binding partners.

Chimeric σ54 activators also offer the opportunity to better understand aspects of the
process of transcriptional activation at σ54 promoters. In the case of NifA, it is known that
binding of the target sequence together with ATP binding promotes oligomerisation of
NifA. It is believed that it is the oligomer which contacts the polymerase and catalyses the
ATP-driven isomerisation of the polymerase holoenzyme. Taking advantage of the
superactivation effect described above, it may be possible to address questions such as
which components of the oligomer (e.g. the DNA-bound NifA vs. NifAΔC, i.e. NifA with
the DNA binding domain removed) are contacting the polymerase and/or coupling
ATP hydrolysis to transcriptional activation. Furthermore, usage of NifAΔC cofactors
from different species (together with their diversification by PCR shuffling) allows
identification of the sequence regions critical for transcriptional activation and a
"maturation" of the NifAΔC coactivator. Indeed, we have found the NifA from K.
pneumoniae to be a superior cofactor to A. vinelandii NifA. Finally, it may be possible to
use chimeras of a known DNA binding domain (e.g. GCN4) and a cDNA library as a
prokaryotic "enhancer" trap, to isolate σ54 activators on a genome-wide scale.

Configuration of the invention to detect protein-protein interactions follows the general
scheme of the yeast two-hybrid assay, and the reagents used in the invention may be set
up accordingly. In general, therefore, the invention will comprise a nucleic acid binding
domain-bait fusion, and a prey-σ54 activator domain fusion. Although, in general, "bait"
refers to a known polypeptide and "prey" to an unknown polypeptide, the terms may be
used interchangeably. Indeed, the invention comprises configurations in which both bait
and prey are known, or both are unknown.

Binding between the bait and the prey results in constitution of a hybrid protein which
comprises both a nucleic acid binding domain and a σ54 activator domain. The hybrid
protein is able to activate transcription from a reporter gene, thus providing a bait-prey
binding-dependent signal.

Protein-protein interactions may be selected using the preferred NifA system, in which the
hybrid σ54 transcriptional activator includes the NifA activation domain. The NifA
bacterial two-hybrid system may be used for the generation of interaction matrices between
cDNA libraries. Ultimately such interaction matrices may yield an interaction map of the
proteins of an organism. The invention provides an alternative to the yeast two-hybrid
system.

Systems based on σ54 have a number of advantages over the other systems that are
available, e.g. the conceptually similar yeast one- and two-hybrid systems. A bacterial host
allows substantially larger repertoires to be obtained and thus a much larger molecular
diversity to be screened. In particular, using combinatorial infection, the system of the
invention allows the "crossing" of σ54-chimera repertoires with libraries of hybrid
reporter constructs, thus permitting coevolution of DNA binding domains and recognition
sites, or coselection of DNA binding domains and target sites from genomic libraries.

Because selection in the σ54-based system is based on a positive readout, i.e. activation of
transcription, it is less prone to false positives than other approaches relying on the
inhibitory effect of the expressed DNA binding domains, like the transcription interference
assay (Elledge S.J. et al (1989) Proc. Nat. Acad. Sci. USA 86, 3689). In vivo selection in
general may result in the selection of novel DNA binding domains that are more attuned to
working under realistic conditions, including supercoiling of the recognition site, presence
of a large excess of chromosomal DNA and high protein concentration. Another
advantage of the system of the invention is that extremely low levels of the hybrid protein
appear to be sufficient to effect maximum activation of transcription. This is particularly
helpful in the case of DNA binding domains that are prone to aggregation.

The σ54-based systems of the present invention may be further adapted to take into
account potential disadvantages of bacterial expression. For instance, E. coli expression
may be suboptimal for large eukaryotic transcription factors. However, large eukaryotic
proteins can often be split into smaller domains which retain function and are usually
readily expressed in E. coli.

According to the third configuration, a constitutively active σ54 activator may be used to
screen a library of candidate DNA-bending polypeptides, preferably in an IHF negative host.
Since the degree of activation by the σ54 activator may be dependent on DNA bending by
additional factors, the levels of expression of the reporter gene will be modulated by the
DNA-bending activity of the candidate DNA-bending polypeptides.

The invention is further described, for the purposes of illustration only, in the following
examples.

Examples 

NifA from A. vinelandii is a well-studied member of the family of bacterial enhancers and
it is a positive regulator of the expression of nitrogenase components in diazotrophs. It is
inhibited by NifL in response to the presence of oxygen or ammonia. When expressed in E.
coli, which lacks endogenous NifL or an equivalent, NifA is constitutively active. Because
of the highly conserved nature of the activation mechanism of σ54 RNA polymerase, NifA
is a very strong activator of transcription in E. coli.

Like other members of the family of bacterial enhancer proteins, NifA is modular in
architecture, both structurally and functionally, comprising 3 domains: an N-terminal sensor
domain, a central activation domain (AD), and a C-terminal DNA binding domain (DBD).
The central activation domain (AD) can activate transcription independent of DNA
binding if overexpressed. Thus the DBD's function appears to be primarily to increase the
activation domain's concentration in the proximity of the promoter.

We have exploited the modularity of the enhancer structure and swapped the natural NifA
DNA binding domain (DBD) for heterologous DBDs and libraries thereof. Here we
describe the activity of these NifA-chimeras in the activation of transcription from the
σ54-dependent promoter nifH and hybrids thereof.

Materials & Methods 

Media & Reagents

2xTY and MacConkey agar are described elsewhere (Miller J.H. (1972) Experiments in
molecular genetics. Cold Spring Harbor, NY). Antibiotics were used at the following
concentrations: ampicillin 0.1 mg/ml, chloramphenicol 10 µg/ml, streptomycin 25 µg/ml.
Min-lac medium was essentially M9 medium (Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y.) supplemented with 1 mM MgSO4, 20 µM CaCl2, 2% (w/v) lactose, 2 mg/ml
casamino acids, 40 µg/ml L-tryptophan, 5 µg/ml thiamine and appropriate antibiotics.
Min-lacX plates were essentially M9 plates supplemented with 2% lactose, appropriate
antibiotics and 40 µg/ml X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside).

Strains 

TG1AK was derived from TG1 (Gibson T.J. (1984) Studies on the Epstein-Barr virus
genome. University of Cambridge) using the genome integration strategy of Haldimann A.
et al. (1996) Proc. Nat. Acad. Sci. USA 93, 14361. Briefly, NifA (K. pneumoniae)
residues 1-462 were amplified using Pfu polymerase (Stratagene) and primers 1 (5'- GAG
TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC -3') and 2 (5'- CGC GGA
TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C -3'), cut with
NdeI and BamHI, cloned into the genome targeting suicide vector pSK50D-uidA2
(Haldimann, Op. Cit.) and transformed into the Pir+ host strain BW23473 (Metcalf W.W.
et al (1994) Plasmid 35, 1). Vectors were isolated and transformed into the Pir- strain TG1
harbouring the plasmid pINT-ts (Hasan N. et al (1994) Gene 150, 51). Chromosomal
integration was induced by a temperature shift to 42°C, which leads to expression of λ
integrase from pINT-ts and simultaneously stops its replication. Integrants were identified
by kanamycin resistance and screened for Nif coactivation. Once obtained, TG1AK was
grown routinely without antibiotic selection.
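
As a simple consistency check on a cloning scheme of this kind, one can confirm that each primer carries the recognition site of the enzyme used to cut the amplified fragment. The Python sketch below does this for primers 1 and 2 as printed above, using the standard recognition sequences of NdeI (CATATG) and BamHI (GGATCC). The helper name and the dictionary layout are hypothetical and form no part of the described method.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def sites_in_primer(primer):
    """Return {enzyme: 0-based position} for each recognition site present."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


primer_1 = "GAG TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC"
primer_2 = "CGC GGA TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C"

print(sites_in_primer(primer_1))
print(sites_in_primer(primer_2))
```

The same check can be applied to any of the other primers by adding the relevant recognition sequences to the dictionary.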

Constructs 

Chimeric constructs were based on pDB737 (Austin S. et al. (1994) J. Bacteriol. 176, 3460;
Buck M. et al. (1986) Nature 320, 374) encoding NifA (A. vinelandii) under the control of
the T7 promoter in the plasmid pT7-7 (Tabor S. & Richardson C.C. (1985) Proc. Natl.
Acad. Sci. USA 82, 1074). Expression was by leakiness of the T7 promoter. Chimeras were
constructed taking advantage of a unique BanII cutting site in the linker region between
the central domain of NifA and the DBD. GCN4 was amplified using Pfu polymerase
(Stratagene) and primers 3 (5'- GCT GCC AGC GAG AGC CCG CCG CTC GCC GCG
ATT GTG CCC GAA TCC AGT GAT CCT -3') and 4 (5'- GAG CTA AAG CTT TTA
TTA GCG TTC GCC AAC TAA TTT CTT TAA TCT GGC -3'), cut with BanII and
HindIII and ligated into pDB737 cut with BanII and HindIII. ERDBD was amplified using
primers 5 (5'- GTC GAC AAC GAG AGC CCG CCG CTC GCC GCG GAA ACG CGT
TAG TGC GCT GTT -3') TGC and 6 (5'- GGT CAG CGC GTG GAT CCT TAA CCA
CCA CGA CGG TCT TTA CG -3'), cut with BanII and BamHI and ligated into pDB737
cut with BanII and BamHI. The vector p737S1 is derived from pDB737 by replacing the
bla gene with aadA, conferring streptomycin resistance, and by inserting an f1 phage
origin for packaging of the vector into filamentous phage particles. Briefly, aadA was
amplified using primers 7 (5'- TCA GCG CAC GCT GAC GTC GTG GAA ACG GAT
GAA GGC ACG AAC -3') and 8 (5'- CCG CCT GGA GGT GGC CAT TAT TTG CCG ACT
ACC TTG GTG ATC TCG CC -3'), cut with AatII and MscI and ligated with
pDB737 cut with AatII and ScaI. The resulting vector p737S was cut with AatII and ClaI. The
f1 ori was amplified using primers 9 (5'- GCT GCC GAC TCG ATC GAT GAA TGG
CGA ATG GCG CCT GAT GCG G -3') and 10 (5'- CCG GGT CGT GAC GTC AGT GTT
GGC GGG TGT CGG GGC TGG C -3'), cut with AatII and ClaI and cloned into the cut
p737S to give p737S1. NifA-X chimeras were transferred from pDB737 to p737S1 by
digestion with NdeI and HindIII (BamHI for NifA-ERDBD).

Reporter constructs were derived from pACYC184 and the vector pMB1 (Buck M. et al.
(1986) Nature 320, 374). Briefly, the lac operon (lacZYA) was amplified with primers 11
(5'- GAG TCA ATT CGG GGA TCC CGT CGT TTT ACA ACG TCG TGA CTG G -3') and
12 (5'- GAG TCA TTC TGG CCA GTC GAG CGC TCT GCC GGT GGT TAG -3') and
cut with BamHI and MscI. The nifH promoter segment from pMB1 was amplified with
primers 13 (5'- GAG TCA TTC AAG CTT GCG TGG AAT AAG ACA GAG GGG
GCG -3') and 14 (5'- GAG TCA TTC GGG ATC CCC GGA TTT ACC GAT ACC GCC
TTT ACC -3') and cut with HindIII and BamHI, and the 2 fragments were simultaneously
ligated with pACYC184 cut with HindIII and BsaAI to give pMB3. The f1 ori was amplified
with primers 15 (5'- GCT GCC GAC TCG GCT AGC GAA TGG CGA ATG GCG CCT GAT
GCG G -3') and 16 (5'- GCC GGG TCG CTT TAA AGT GTT GGC GGG TGT CGG GGC
TGG C -3'), cut with NheI and DraI and ligated into pMB3 cut with NheI and XmnI
to give pMB31.

Selection and screening 

Cells were cotransformed either by simultaneous or sequential electroporation with an
expressor construct and a reporter construct, grown overnight with appropriate
antibiotic selection at 34°C in M9-lac medium and plated out. β-gal expression was scored
either on MacConkey or Minlac-X-gal indicator plates or by ONPG enzyme assay of
selected colonies (see below).

Enzyme assay 

ONPG assays used to measure β-gal activity were performed essentially as described by
Kolmar H. et al. (1995) EMBO J. 14, 3895. Briefly, 20 µl of an overnight culture was
transferred to a microtitre well, 100 µl of chloroform-saturated Z-buffer (100 mM NaHPO4,
1 mM KCl, 1 mM MgSO4, 50 mM β-mercaptoethanol, pH 7.0 (Miller J.H. (1972) Experiments
in molecular genetics. Cold Spring Harbor, NY)) was added and the optical density at 600 nm
determined using an ELISA reader. Cells were lysed by addition of 30 µl Z-buffer with
0.4% (w/v) SDS and incubated at 20°C for 10 min. 50 µl of Z-buffer with 4 mg/ml
o-nitrophenyl-β-D-galactopyranoside were added and the optical density at 420 nm was
recorded automatically every 15 s over a period of 60 min. Specific β-gal activity was
calculated from the Vmax as in Miller (Op. Cit.).
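
One way a maximum rate can be extracted from a kinetic read of this kind is sketched below in Python. It is a minimal illustration under stated assumptions, not the calculation used in the cited references: the maximum rate is taken as the steepest least-squares slope of the 420 nm absorbance over a sliding window of consecutive readings, and is then divided by the 600 nm reading to normalise for cell density. The function names, the window size and the example numbers are all hypothetical, and the result is in arbitrary units rather than Miller units.

```python
def max_rate(times_s, a420, window=5):
    """Steepest least-squares slope (A420 per second) over `window` readings."""
    if len(times_s) != len(a420) or window < 2 or len(times_s) < window:
        raise ValueError("need equal-length series with at least `window` points")
    best = float("-inf")
    for i in range(len(times_s) - window + 1):
        t = times_s[i:i + window]
        y = a420[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((ti - t_mean) ** 2 for ti in t)
        sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        best = max(best, sxy / sxx)
    return best


def normalised_activity(times_s, a420, od600, window=5):
    """Maximum rate divided by culture density (arbitrary units)."""
    return max_rate(times_s, a420, window) / od600


# Hypothetical readings taken every 15 s, with a short lag before colour develops.
times = [0, 15, 30, 45, 60, 75]
a420 = [0.0, 0.0, 0.03, 0.06, 0.09, 0.12]
print(max_rate(times, a420, window=3))
print(normalised_activity(times, a420, od600=0.5, window=3))
```

Using the steepest window rather than the overall slope avoids underestimating the rate when there is an initial lag or a late plateau as substrate is depleted.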

Example 1: NifA chimera with heterologous DNA binding domains activate
transcription but only from promoters with a cognate recognition site

To investigate in what way transcription activation by NifA was dependent on the NifA
DNA binding domain (DBD) and on native nif promoter structure, we prepared NifA-chimeras
in which the NifA DNA binding domain (DBD) had been replaced by heterologous
DBDs of diverse structural architectures. Initially we explored DBDs which, like the NifA
wild type (wt) DBD, bind to symmetrical DNA recognition sequences, such as the basic
leucine zipper (bZIP) DBD of the yeast transcription factor GCN4 and the Zn-finger domain
of the human estrogen receptor DNA binding domain (ERDBD), and determined their
capacity to activate transcription of a lacZ reporter gene in vivo from a hybrid nifH
promoter, in which the NifA UAS had been deleted and replaced by recognition sites for
the heterologous DBDs.

In order to simplify comparison of transcription activation by NifA chimeras with
activation by wt NifA, all reporter constructs had a single DNA recognition site. The wt
nifH promoter UAS contains three bona fide NifA recognition sites. Deletion of the two
sites more distal to the promoter, however, did not appear to reduce transcription
activation in our reporter under conditions tested.

Transcription activation by NifA-chimeras was specific in that they only activated lacZ
expression from hybrid promoters bearing their cognate recognition sequences but not
from control reporter constructs bearing wild type UAS or a non-cognate site (Fig. 3). In
analogy to wt NifA, the presence of two or more recognition sites (in phase, see below) did
not increase activation by the Nif-GCN4 chimera.

Activity was also dependent on the phasing of the recognition site with respect to the
promoter: when the symmetric ATF/CREB recognition site for GCN4 was offset in
increments of 1 bp, optimal activity was observed when the ATF/CREB site was centred on
the same bp as the symmetric UAS. Presumably efficient contact with the RNA Pol
holoenzyme requires that the activator be bound on the right face of the DNA.

Transcription activation by NifA-chimeras appears to preserve fine specificity of isolated
DBDs. Wild-type GCN4 binds with equal affinity to the symmetric ATF/CREB site as
well as to the pseudo-symmetric AP-1 site. Indeed, the NifA-GCN4 chimera showed
identical levels of transcription activation in reporter constructs with either of these sites
(Fig. 3). A NifA-ERDBD chimera showed strong activity on a reporter with its cognate
ERE site but no activity above background levels with reporters bearing the similar GRE
recognition site for the closely related glucocorticoid receptor DBD (Fig. 4).

Example 2: Coexpression of wild-type NifA with NifA-chimeras boosts specific 
transcription activation by NifA chimeras in a specific and DNA independent manner 

The level of transcription activation by the NifA-GCN4 and NifA-ERDBD chimeras was
lower (ca. 10%) than for wt NifA. However, near wt levels of activity (up to 80%) were
reached when wt NifA was coexpressed within the same cell as a "coactivator".

The coactivation was independent of DNA binding, as NifA variants in which the DBD
had been deleted (NifAΔC) were found to be just as active as wt NifA. On the other hand,
coexpression of an isolated NifA central domain (both the DBD as well as the N-terminal
sensor domain deleted (NifAΔNC)) failed to coactivate. NifA derived from different
species showed greatly variable efficiencies as coactivators. NifA variants from K.
pneumoniae (NifA Kp, NifAΔC Kp) were almost three times as effective as A. vinelandii
NifA, while NifA variants from Rhizobium (NifA Rh1, NifA Rh2) were poorly active as
coactivators (Fig. 5).

The coactivator effect was found to enhance only specific transcriptional activation and not
background levels of transcription from promoters with non-cognate recognition sites. We
therefore constructed an E. coli strain expressing NifAΔC Kp (the K. pneumoniae NifA
with its DBD deleted) from a weak promoter (phoB) from the chromosome (TG1:AK).

The coactivation effect has analogies in eukaryotic transcription, for example the enhancer
Sp1, in which isolated Sp1 activation domains can stimulate transcriptional activation by
the DNA-binding form of Sp1, a phenomenon termed "superactivation".

Example 3: Tethering of NifA chimera at the UAS is sufficient for activation, but strong
activation requires correct positioning

We also investigated transcription activation by NifA-chimeras with asymmetrical
recognition sites such as the classic Zn-finger Zif268 as well as the DBD from p53.

Both NifA-Zif268 and NifA-p53 chimeras activated transcription, but only at low levels
(2- to 5-fold above the background). However, when the Zif recognition site was duplicated
to give a symmetric palindromic site, transcription activation increased substantially.
Non-palindromic duplication of the recognition site in tandem did not increase activation.

Thus while simple tethering is sufficient for some activation, only bipartite binding
appears to give a strong activation. Presumably, tethering only leads to an approximate
positioning of the activation domain with respect to the RNA polymerase holoenzyme,
thereby reducing the likelihood of a productive interaction.



Example 4: Selection of active NifA-chimeras by lac complementation 

Using expression of the lac operon (lacZYA) from our reporter construct as the read-out
of transcription activation allows the selection of active NifA-chimera on the basis of
metabolic complementation of a Δlac strain, with lactose as the only carbon source.
Initially we spiked populations of NifA-ERDBD with NifA-GCN4 at the ratios 1/10^,
1/10^6 in the presence of the GCN4 cognate reporter ATF/CREB-nifH and grew
populations overnight in minimal medium supplied with lactose. Pre- and post selection
populations were scored by plating on MacConkey-lactose plates as well as by PCR
screening. The results are summarised in Table 1. Selection factors of up to 10,000-fold
per round were observed.



Table 1: Selection factors for Nif selection by lac complementation

NifGCN4/NifERDBD      Selection factor

1/10^                 40 fold
1/10^                 40 fold
1/10^                 200 fold
1/10^                 4000 fold
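
The text does not spell out how the selection factors were computed. A common way to express enrichment in spiking experiments, assumed here purely for illustration, is the ratio of the frequency of the desired clone after selection to its frequency before selection. The Python sketch below implements that assumed definition; the function name and the example counts are hypothetical and are not data from this specification.

```python
def selection_factor(pre_positive, pre_total, post_positive, post_total):
    """Enrichment = (post-selection frequency) / (pre-selection frequency).

    This definition is an assumption made for illustration; counts may come
    from colony scoring or PCR screening of the two populations.
    """
    if min(pre_positive, pre_total, post_total) <= 0 or post_positive < 0:
        raise ValueError("counts must be positive (post_positive may be zero)")
    pre_freq = pre_positive / pre_total
    post_freq = post_positive / post_total
    return post_freq / pre_freq


# Hypothetical example: 1 in 1,000,000 before selection, 4 in 1,000 afterwards.
print(round(selection_factor(1, 1_000_000, 4, 1_000)))
```

With this definition, a clone that rises from one in a million to four in a thousand has been enriched 4000-fold in that round.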



Example 5: Selection of active NifA-chimeras by flow cytometry.

Expression of β-galactosidase (lacZ) as the read-out of transcription activation allows the
selection of active NifA-chimera on the basis of metabolic complementation of a Δlac
strain, grown on lactose as the only carbon source. However, metabolic selection
predisposes the system to the generation of false positives. Presumably, the prolonged
growth under metabolic selection selects for mutant promoters, active in the absence of a
cognate enhancer.

We have observed that this only occurs for library sizes exceeding 10*. Indeed, others
have found (using a related bacterial two-hybrid system) that it is not possible to retrieve
positive clones from dilutions higher than 1/10* by metabolic lac selection (G. Karimova,
et al., (1998) Proc Natl Acad Sci USA 95, 12532-7). As it is well known that bacteria can
develop a mutator phenotype under adaptive stress (P. D. Sniegowski, P. J. Gerrish, R. E.
Lenski, (1997) Nature 387, 703-5), we conclude that it is preferable to separate the
selection from the amplification (growth) step in order to reduce the likelihood of
revertants.



We thus replaced lacZ with the Aequorea victoria green fluorescent protein (EGFP, F64L,
S65T, ex 488 nm, em 527 nm; the Clontech variant pGFPmut3.1, S65E, S72A, ex 501 nm,
em 511 nm, a FACS optimised variant, was also tried, but found inferior) as the reporter
gene. GFP has the advantage that cells can be grown first and then separated on the basis of
fluorescence using fluorescence activated cell sorting (FACS).

We prepared a trial library of mutant GCN4 bZIP DBDs (library size 10*) in which 5 key
residues (Asn235, Ala238, Ala239, Ser242, Arg243) of GCN4 interacting with DNA were
randomised and selected it against a GFP hybrid reporter with the cognate ATF/CREB
site. Library populations were grown overnight at 34°C in non-fluorescent medium NFM
(minimal medium supplied with 2% glucose, 0.2% casamino acids, 12 µg/ml L-Trp). For
FACS (Cytomation MoFlo, 488 nm laser, FL-1 530/40 filter) a 1 ml aliquot was diluted
10X in NFM and the top 1% fluorescent cell population was sorted into a 96 well plate at
1 cell per well, and grown up overnight at 34°C. Cell fluorescence of the grown up clones
was measured by using a SPECTRAmax GEMINI Dual-Scanning Microplate
Spectrofluorometer (Molecular Devices), ex 480, em 520 (cut-off 515 nm). Plasmids from
fluorescent wells were sequenced afterwards. Pre- and post selection populations were
also scored by PCR screening as well as by plating on min glu (M9 minimal medium +
glucose) plates and visualised using a fluorescence microscope.
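
The idea of a sort gate that keeps only the brightest fraction of events can be sketched in a few lines of Python. This is an illustration only and does not represent the instrument's software: it assumes a list of per-cell fluorescence intensities is available, and the function names and example values are hypothetical.

```python
def top_fraction_gate(intensities, fraction=0.01):
    """Return the intensity threshold that admits the brightest `fraction`."""
    if not intensities or not 0 < fraction <= 1:
        raise ValueError("need a non-empty list and 0 < fraction <= 1")
    ranked = sorted(intensities, reverse=True)
    n_keep = max(1, int(round(len(ranked) * fraction)))
    return ranked[n_keep - 1]


def events_in_gate(intensities, fraction=0.01):
    """Indices of events at or above the gate threshold."""
    threshold = top_fraction_gate(intensities, fraction)
    return [i for i, value in enumerate(intensities) if value >= threshold]


# Hypothetical population of 1000 events with distinct intensities 0..999.
population = list(range(1000))
print(top_fraction_gate(population))
print(len(events_in_gate(population)))
```

If several events share the threshold intensity, slightly more than the requested fraction will fall inside the gate.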

10' cells were sorted in total, from v^ich 219 cells were in the top 1% fluorescent 
population and 1 32 of which were captured to the 96-well plates. 1 3 cells from these were 
fluorescent. Selected positives were checked by separating the mutant G(X4-bZIP DBD 
expressor plasmids. and re-transforming them together witii cognate and non-cognate 
25 reporter plasmids. None of the selected positives gave a fluorescent signal when 
combined non-cognate reporter plasmids, but all were fluorescent when combined with 
the ATF/Creb cognate reponer plasmid (which did not produce any fluorescence vyhen 
transformed on its own). 
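
The sort statistics quoted above can be turned into fractions with a few lines of arithmetic. This is a minimal sketch using only the counts given in the text; the variable names are illustrative and not part of the original work.

```python
# Counts reported for the FACS experiment.
cells_in_top_gate = 219   # cells falling in the top 1% fluorescent population
cells_captured = 132      # of those, cells deposited into 96-well plates
fluorescent_clones = 13   # captured cells that grew into fluorescent clones

# Fraction of gated cells that were actually recovered in wells.
capture_fraction = cells_captured / cells_in_top_gate

# Fraction of recovered clones that were fluorescent after regrowth.
positive_fraction = fluorescent_clones / cells_captured

print(f"captured:    {capture_fraction:.1%}")
print(f"fluorescent: {positive_fraction:.1%}")
```

The fluorescent fraction comes out at just under one clone in ten.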

This indicates that GFP selection indeed avoids the isolation of false positives.
Furthermore, when the library was checked prior to FACS sorting no fluorescent clones
were identified when plating >10' cells. 1/10 clones plated post selection were
fluorescent, suggesting a selection factor in a single round in excess of 10*-fold.

All publications mentioned in the above specification are herein incorporated by
reference. All database sequences denoted by accession or gi numbers are likewise
incorporated by reference.

Various modifications and variations of the described methods and system of the
invention will be apparent to those skilled in the art without departing from the scope and
spirit of the invention. Although the invention has been described in connection with
specific preferred embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed, various modifications
of the described modes for carrying out the invention which are obvious to those skilled in
molecular biology or related fields are intended to be within the scope of the following
claims.
Claims

1. A method for detecting a protein-nucleic acid interaction between a nucleic acid molecule
and a protein molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.

2. A method according to claim 1, comprising providing a repertoire of hybrid σ54
activator proteins, said repertoire comprising a plurality of different nucleic acid binding
sequences.

3. A method according to claim 1, comprising providing a repertoire of hybrid nucleic
acid molecules, said repertoire comprising a plurality of different binding sites for the
nucleic acid binding sequence.

4. A method according to claim 1, comprising providing both a repertoire according
to claim 2 and a repertoire according to claim 3.

5. A method for detecting a protein-protein interaction, comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

6. A method according to claim 5, comprising providing a repertoire of first hybrid
proteins, said repertoire comprising a plurality of bait polypeptides.

7. A method according to claim 5, comprising providing a repertoire of second hybrid
proteins, said repertoire comprising a plurality of prey polypeptides.

8. A method according to claim 5, comprising providing a repertoire of first hybrid
proteins and a repertoire of second hybrid proteins, said repertoires comprising a plurality
of bait and prey polypeptides.

9. A method for screening a repertoire of candidate DNA-bending
polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in a HIF* host cell, such that σ54 activator and the nucleic acid molecule may
interact, and transcription activated from the σ54 RNAP binding site in a manner
dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

10. A method according to any preceding claim, wherein the polypeptides are obtained
by expression within a bacterial host cell.

11. A method according to claim 10, wherein the polypeptides are encoded by one or
more libraries of nucleic acid vectors.

12. A method according to claim 11, wherein a first library of nucleic acid vectors
encodes a first chimeric gene, said gene comprising a nucleic acid sequence that encodes a
nucleic acid-binding domain and a nucleic acid sequence encoding a first (bait) test protein or
protein fragment in such a manner that the first test protein is expressed as part of a hybrid
protein with the nucleic acid-binding domain.

13. A method according to claim 11, wherein a second library of nucleic acid vectors
encodes a second chimeric gene, said gene comprising a nucleic acid sequence that
encodes a σ54 transcriptional activation domain and a second (prey) test protein or protein
fragment into the vector, in such a manner that the second test protein is capable of being
expressed as part of a hybrid protein with the transcriptional activation domain.

14. A method according to any preceding claim, wherein the σ54 transcriptional
activator is selected from the group consisting of:

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR;
emb|CAA26472.1| (X02616) pot. Nifa gene product (aa 1-484) [Klebsiella pneumoniae];
emb|CAA53584.1| (X75972) anfa [Rhodobacter capsulatus];
emb|CAA92413.1| (Z68203) nifa homologue [Rhizobium sp.];
emb|CAA93242.1| (Z69251) mopr [Acinetobacter calcoaceticus];
emb|CAB53157.1| (X07567) nifai [Rhodobacter capsulatus];
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri];
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli];
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa];
gb|AAB91397.1| (AF033203) nifaii protein [Rhodobacter capsulatus];
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis];
gb|AAC37124.1| (L51176) fleq [Pseudomonas aeruginosa];
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus];
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae];
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator nifa [Rhodospirillum rubrum];
gb|AAD38416.1| (AF155934) nifa [Alcaligenes faecalis];
gb|AAF28395.1| (AF069392) flam [Vibrio parahaemolyticus];
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein;
gb|AAF61932.1| (AF230804) sigma-54 activator protein Act I [Myxococcus xanthus];
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa];
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae];
gb|AAG01527.1|AF288483_1 (AF288483) nifa [Azospirillum brasilense];
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli;
pir||B49940 nitrogen regulator I homolog - Escherichia coli;
pir||C70320 transcription regulator nifa family - Aquifex aeolicus;
pir||C70396 transcription regulator ntrc family - Aquifex aeolicus;
pir||C70454 transcription regulator ntrc family - Aquifex aeolicus;
pir||D70315 transcription regulator ntrc family - Aquifex aeolicus;
pir||H69581 transcription activator of acetoin dehydrogenase operon acor - Bacillus subtilis;
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens;
pir||JC5471 regulatory protein nifa - Azospirillum lipoferum;
pir||T08624 probable ntrc-type response regulator - Eubacterium acidaminophilum;
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN;
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA;
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49a KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB);
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN;
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN;
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN;
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR;
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN;
sp|Q06065|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ARGININE;
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN;
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN; and
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR.

15. A method according to any one of claims 1 to 14, wherein the σ54 transcriptional
activator is the NifA transcriptional activator or the PspF transcriptional activator.

16. A method according to any one of claims 1 to 14, wherein the hybrid σ54
transcriptional activator is NifA and activation resulting from NifA-σ54 RNAP interaction
is enhanced by the coexpression of wild-type or mutant NifA.

17. A method according to claim 16, wherein the hybrid σ54 transcriptional activator is
NifA from Azotobacter vinelandii, and the wild-type or mutant NifA is NifA from
Klebsiella pneumoniae.

18. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a binding site for a factor which induces DNA bending.

19. A method according to claim 18, wherein the factor is integration host factor (IHF).

20. A method according to any one of claims 1 to 17, wherein the nucleic acid
molecule comprises DNA that is intrinsically bent.

21. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a nifH promoter from A. vinelandii driving a reporter gene.

22. A method according to any preceding claim, wherein the reporter gene is selected
from the group consisting of metabolic markers such as the lac operon (lacZ, lacY and
lacA); proteins conferring a fluorescent phenotype, such as GFP; proteins conferring
antibiotic resistance, such as Zeo; and proteins conferring another selectable property.

23. A method according to any preceding claim, which is carried out in the presence of
a compound which modifies protein-protein or protein-DNA interaction.

24. A method according to claim 22, wherein the compound is selected from the group
consisting of molecules which alter the structure of the DNA-binding protein; molecules
which alter the structure of DNA; and molecules which modify protein-protein
interactions.

25. A method according to any preceding claim, which is carried out in vivo.

26. A method according to claim 25, wherein the in vivo host is E. coli.

27. A method according to any one of claims 1 to 24, which is carried out in vitro.



WO 01/18244    PCT/GB00/03450

[Figure 1 (sheet 1/5). Panel A labels: NifA binding site (UAS); sensor and activation domain; DNA-binding domain; UAS; σ54 promoter; nif gene (OFF); RNA polymerase σ54 holoenzyme. Panel B labels: GCN4 cognate site; Nif-GCN4 hybrid; lacZ.]

[Figure 2 (sheet 2/5).]

[Figure 3 (sheet 3/5). Legend: NifAwt/UAS; NifGCN4/CREB + NifADC; NifGCN4/AP-1 + NifADC; NifGCN4/UAS + NifADC.]

[Figure 4 (sheet 4/5).]
Selection System

The present invention relates to a screening system useful for screening repertoires of
DNA binding domains. In particular the invention relates to a screening system based on
transcriptional activators of bacterial σ54-dependent promoters.

The majority of proteins involved in cellular functions do so by interacting with other
proteins or nucleic acid sequences within the cell. Several approaches have been described
that allow the in vivo selection of nucleic acids which express polypeptides capable of
binding to proteins or DNA in the cell. Arguably the most powerful approaches are the
yeast one- and two-hybrid systems (Fields S. & Song O. (1989) Nature 340, 245; see US
Patent 5,283,173, incorporated herein by reference in its entirety) for the screening of
protein-DNA and protein-protein interactions, respectively. However, the two-hybrid
system requires a eukaryotic host and consequently the diversity that can be screened is
limited. Furthermore the system notoriously suffers from an abundance of false positives.

Larger molecular repertoires can be prepared in bacterial hosts and a number of bacterial
systems for the screening of protein-protein and protein-DNA interactions have also been
reported. Two systems have been put forward in which the polypeptide chain of an enzyme
is expressed in two parts fused to two candidate polypeptides, and in which interaction
between the candidate polypeptides reconstitutes the function of the enzyme (Karimova G.
et al (1998) Proc. Natl. Acad. Sci. USA 95, 5752; Pelletier J.N. et al (1998) Proc. Natl.
Acad. Sci. USA 95, 12141).

Several in vivo screens for DNA-binding proteins have also been reported (reviewed in
Mossing M.C., Bowie J.U. & Sauer R.T. (1991) Methods Enzymol. 208, 604; Elledge S.J.
et al (1989) Proc. Natl. Acad. Sci. USA 86, 3689). Each of these methods involves the
blockage of a hybrid σ70 promoter by the DNA binding protein. Repression of the
promoter either prevents the production of a conditionally toxic gene product or alleviates
repression of an antibiotic gene by transcriptional interference. The transcriptional
interference assay (Elledge et al) has been used successfully in one case to select DNA
binding proteins with altered specificity (Sera T. & Schultz P.G. (1996) Proc. Natl. Acad.
Sci. USA 93, 2920).

Another σ70-based system utilises recruitment of the polymerase to the promoter by way
of a protein-protein interaction between a protein domain fused to the RNA polymerase
α subunit and another fused to the lambda repressor bound immediately upstream of the
RNA polymerase promoter binding site (Dove S.L., Joung J.K., Hochschild A. (1997)
Nature 386, 627). By replacing the lambda repressor DNA binding domain with a library
of Zn-finger domains, specific DNA binding Zn-finger domains were selected (Joung J.K.,
Ramm E.I., Pabo C.O. (2000) Proc. Natl. Acad. Sci. USA 97, 7382).

The alternative holoenzyme form of bacterial RNA polymerase (RNAP) contains the σ54
factor (σ54-RNAP). As has been previously shown, this polymerase, in most cases, forms
a closed complex with the promoter. Unlike σ70 promoters, at which the RNA polymerase
is bound in an active form and is largely controlled by repression, the σ54 RNA
polymerase holoenzyme is transcriptionally incompetent and is unable to initiate
transcription by itself. Initiation of transcription requires the presence of a transcriptional
activator that catalyses the isomerisation of the closed promoter complex to an open one.
Typically, activator proteins bind to a specific upstream activation sequence (UAS) located
80 to 200 bp upstream of the σ54 core promoter. The function of the UAS is to tether the
activator in the right position and to bring it in the vicinity of the promoter in order to
increase the efficiency of interaction between the σ54 RNAP and the activator.
Transcriptional activators of σ54-dependent promoters have been called bacterial
enhancers because their mechanism of activation is superficially similar to the activation of
transcription by enhancer proteins in eukaryotes (Kustu S. et al (1991) Trends Biochem. Sci.
16, 397).

Conversion of the σ54 RNAP into an active form is catalysed by the binding of an
enhancer protein coupled to hydrolysis of ATP. This unusual mechanism accounts for the
low level of background transcription and the enormous difference (10^-10^) between on
and off states in the strongest σ54 promoters effected by a single factor. In comparison,
activators of σ70 promoters such as CAP or λcI increase transcription levels usually by
less than 10-fold.
Transcriptional activators of σ54 promoters (also known as enhancer-binding proteins or
EBPs) share a common structure (see Morett and Segovia, (1993) J. Bacteriol. 6067-6074)
comprising a non-conserved N-terminal domain which has a putative regulatory
function, a central domain which is responsible for transcriptional activation, and a
C-terminal DNA binding domain which binds the relevant UAS in the target gene. The
domains are modular: the central and N-terminal domains together are capable of
constitutive activation of σ54 RNAP when overexpressed. At least in some cases, the
isolated DNA binding domain is capable of specifically binding its DNA recognition site.

In many instances, interaction between σ54 RNAP and the activator is enhanced by a
cellular factor which promotes DNA bending between the UAS and the σ54 promoter
(Freundlich et al, (1992) Mol. Microbiol. 6:2557-2563). This factor, known as integration
host factor (IHF), acts to promote transcription from σ54 promoters.

Summary of the Invention

We provide herein a novel screening system which is based on transcriptional activators of
σ54-based promoters.

According to a first aspect of the invention, therefore, there is provided a method for
detecting a protein-nucleic acid interaction between a nucleic acid molecule and a protein
molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.
The invention provides a reporter system which is characterised by very low levels of
background expression, since the σ54 polymerase is transcriptionally incompetent in the
absence of a σ54 transcriptional activator. Since, at physiological concentrations, the
binding of the transcriptional activator to the nucleic acid is required in order to activate
transcription by σ54 RNAP, the system of the invention may be used as a tool for
investigating and/or screening protein/nucleic acid interactions exploiting the reporter gene
read-out.

In the first aspect of the invention, either the nucleic acid binding protein or the nucleic
acid molecule may be provided in the form of a repertoire of molecules. Repertoires of
hybrid σ54 activator proteins preferably are partially or completely randomised at least in
the heterologous nucleic acid binding sequence. This allows selection from the library of
molecules having desired nucleic acid binding characteristics.

Repertoires of nucleic acid molecules advantageously are partially or completely
randomised in the binding site for the nucleic acid binding sequence of the σ54 activator
protein. This allows selection of nucleic acid molecules having desired binding sites for
the chimeric activators.

In a second aspect of the invention, there is provided a system for selecting protein-protein
interactions based on the constitutively active hybrid σ54 activators described above. The
system according to the invention is conceptually similar to the yeast two-hybrid system.

Accordingly, there is provided a method for detecting a protein-protein interaction,
comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

As will be apparent to those skilled in the art, reference to a "binding site" for the nucleic
acid binding sequence includes the provision of several appropriately spaced binding sites
in the nucleic acid molecule.

As with the yeast two-hybrid system, in which a modular transcription factor is assembled
through binding of DNA binding domain/bait and transcription activating domain/prey
hybrids, the association of the nucleic acid binding sequence and the σ54 transcription
activating domain through the bait/prey interaction allows the detection of, and screening
for, protein-protein binding interactions in vivo and in vitro. Advantageously, the bait
and/or prey polypeptide sequences are provided in the form of repertoires, which may be
partially or completely randomised. This allows selection of prey polypeptides based on
their ability to form interactions with a desired bait (or vice versa). As the assay may be
conducted in vivo, in a bacterium, the invention permits the detection of in vivo binding
interactions between polypeptides in bacteria.
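
The read-out logic of this two-hybrid arrangement can be summarised as a simple truth function. The sketch below is a deliberately idealised Boolean model, not a description of any implementation in the specification: the class and field names are hypothetical, and real reporter output is quantitative rather than on/off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoHybridCell:
    """Idealised state of one host cell (all fields are illustrative assumptions)."""
    bait_hybrid_binds_site: bool   # DNA-binding/bait hybrid occupies its site near the promoter
    bait_binds_prey: bool          # bait and prey polypeptides interact
    prey_fused_to_activator: bool  # prey carries a constitutively active sigma-54 activating domain


def reporter_on(cell: TwoHybridCell) -> bool:
    """Reporter is expressed only if the activating domain is tethered via bait/prey binding."""
    return (
        cell.bait_hybrid_binds_site
        and cell.bait_binds_prey
        and cell.prey_fused_to_activator
    )


# An interacting pair switches the reporter on; a non-interacting pair does not.
print(reporter_on(TwoHybridCell(True, True, True)))
print(reporter_on(TwoHybridCell(True, False, True)))
```

Treating each requirement as a separate flag makes it easy to see which control experiments (for example, prey alone or bait alone) should leave the reporter silent.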

It will be apparent that the hybrid proteins useful in the methods of the invention are
advantageously provided in the form of nucleic acid vectors or libraries thereof capable of
expressing said proteins in a host bacterium. Advantageously, the vector(s) include first
and second chimeric genes which encode the hybrid proteins of the invention. Preferably,
the vectors also include means for replication in bacteria. Also included may be one or
more marker genes, the expression of which in the bacterium permits selection of cells
containing the vector(s) from cells that do not contain the vector(s). Preferably, the
vector(s) are plasmid(s).
In a third aspect, the invention provides a method for screening a repertoire of
candidate DNA-bending polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in a HIF* host cell, such that σ54 activator and the nucleic acid molecule may
interact, and transcription activated from the σ54 RNAP binding site in a manner
dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

It is known that activation by σ54 activators may be regulated by factors which induce
DNA bending in the target gene. For example, the host factor IHF is known to potentiate
σ54 activation; moreover, it may be replaced by alternative DNA bending polypeptides, or
by intrinsically bent DNA.

The invention moreover provides methods for development of improved σ54
activator-based tools.

The first chimeric gene includes a nucleic acid sequence that encodes a nucleic acid-binding
domain and a first (bait) test protein or protein fragment in such a manner that the first test
protein is expressed as part of a hybrid protein with the nucleic acid-binding domain.

The second chimeric gene also includes a promoter and a transcription termination signal
to direct transcription. The second chimeric gene moreover includes a nucleic acid
sequence that encodes a σ54 transcriptional activation domain and a second (prey) test
protein or protein fragment into the vector, in such a manner that the second test protein is
capable of being expressed as part of a hybrid protein with the transcriptional activation
domain.

The invention moreover provides kits for practising the invention, which kits
advantageously comprise a container, two vectors, and a host cell. The first vector contains
a promoter and may include a transcription termination signal functionally associated with
the first chimeric gene in order to direct the transcription of the first chimeric gene. The
chimeric gene advantageously comprises one or more unique restriction site(s) to insert a
nucleic acid sequence encoding a test bait polypeptide. The kit may also include a
second vector which contains a second chimeric gene, optionally comprising one or more
unique restriction site(s) to insert a nucleic acid sequence encoding the prey polypeptide;
alternatively, the second chimeric gene may be present on the same vector as the first
chimeric gene.

Brief description of the Figures

Figure 1A is a schematic representation of σ54 RNAP activation by σ54 activator NifA.

Figure 1B is a schematic representation of the invention, in which the σ54 DNA binding
domain is replaced with a heterologous GCN4 DNA binding domain.

Figure 2 is a schematic representation of the first aspect of the present invention, in which
a library of DNA binding domains is screened together with a library of DNA binding
domain binding sites to identify protein:DNA binding pairs.

Figure 3 shows the activation of transcription by NifA-chimera as expressed as percent of
wt activity (NifA/UAS). Nif-GCN4 (in presence of the NifAΔC coactivator (NifADC))
shows close to wt activity. Equal activity is observed for the two distinct GCN4 DNA
recognition sites (ATF/CREB and AP-1). Less than 1% wt activity is observed with a
non-cognate reporter such as one bearing the wt nifH UAS.

Figure 4 shows the activation of transcription by NifA-chimera as expressed as percent of
wt activity (NifA/UAS). Nif-ERDBD (in presence of the NifAΔC coactivator (NifADC))
shows ca. 80% of wt activity. Very little activation is observed with a non-cognate reporter
bearing the DNA recognition site (GRE) for the closely related glucocorticoid receptor.

Figure 5 shows the coactivation by different NifA variants as expressed as percent of wt
activity (NifAwt). NifA from Klebsiella pneumoniae (NifAKp) is superior to all others,
even exceeding wt activity (up to 160%). NifAKp with its DNA domain deleted
(NifAΔCKp (NifADCKp)) is almost as active.

Detailed Description of the Invention

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art (e.g., in bacterial cell
culture, molecular genetics, nucleic acid chemistry, protein chemistry and biochemistry).
Standard techniques are used for molecular, genetic and biochemical methods (see
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short
Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc. which are
incorporated herein by reference), chemical methods, pharmaceutical formulations and
delivery and treatment of patients.

A: Nucleic Acids and Proteins

As used herein, "nucleic acid" refers to any natural nucleic acid, including RNA and DNA
as well as synthetic nucleic acid comprising modified or synthetic bases, and mixtures of
modified or synthetic bases with natural bases. Such modified and/or synthetic bases may
be referred to as derivatives of DNA or RNA. Preferably, "nucleic acid" refers to DNA.
The invention includes the use of modified and/or artificial "nucleic acids". A number of
modifications have been described that alter the chemistry of the phosphodiester
backbone, sugars or heterocyclic base components of nucleic acids.

Among useful changes in the backbone chemistry are phosphorothioates;
phosphorodithioates, where both of the non-bridging oxygens are substituted with
sulphur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral
phosphate derivatives include 3'-O'-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate,
3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids
replace the entire phosphodiester backbone with a peptide linkage.

Sugar modifications are also known. The α-anomer of deoxyribose may be used, where
the base is inverted with respect to the natural β-anomer. The 2'-OH of the ribose sugar
may be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provides resistance to
degradation without compromising affinity.

Modification of the heterocyclic bases must maintain proper base pairing. Some useful
substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and
5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and
5-propynyl-2'-deoxycytidine have been shown to maintain biological activity when
substituted for deoxythymidine and deoxycytidine, respectively.

As used herein, the term "protein" includes single-chain polypeptide molecules as well as
multiple-polypeptide complexes where individual constituent polypeptides are linked by
covalent or non-covalent means. As used herein, the terms "polypeptide" and "peptide"
refer to a polymer in which the monomers are amino acids and are joined together through
peptide or disulphide bonds. The term "domain" also refers to polypeptides and peptides
having biological function. A peptide useful in the invention will have a binding or
transcription activating capability, i.e., with respect to binding to nucleic acids, other
proteins or polypeptides, and activation of σ54 RNAP transcription. It also may have
another biological function that is a biological function of a protein or domain from which
the peptide sequence is derived.

A hybrid protein is a protein or polypeptide which comprises constituent parts derived
from at least two naturally-occurring or artificial proteins. In particular, it may comprise
the DNA-binding domain of one protein and the protein-binding or transcription activating
domain of a second protein.

B: σ54 Activators

Activators of σ54 transcription are well known and have been reviewed, for example, by
Buck et al., J Bacteriol. 2000 Aug;182(15):4129-36; Studholme and Buck, FEMS
Microbiol Lett. 2000 May 1;186(1):1-9; Shingler, Mol Microbiol. 1996 Feb;19(3):409-16;
Goosen and van der Putte, Mol Microbiol. 1995 Apr;16(1):1-7; Merrick, Mol Microbiol.
1993 Dec;10(5):903-9; and others. A family of such activator proteins has been defined,
and its members found to share homology in the central (catalytic) domain which is
responsible for σ54 RNAP activation.

Members of the family include the following (the numbers are GenBank accession 
numbers) 

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR
emb|CAA26472.1| (X02616) pot. NifA gene product (aa 1-484) [Klebsiella pneumoniae]
emb|CAA53584.1| (X75972) anfA [Rhodobacter capsulatus]
emb|CAA92413.1| (Z68203) NifA homologue [Rhizobium sp.]
emb|CAA93242.1| (Z69251) MopR [Acinetobacter calcoaceticus]
emb|CAB53157.1| (X07567) NifA1 [Rhodobacter capsulatus]
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri]
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli]
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa]
gb|AAB91397.1| (AF033203) NifAH protein [Rhodobacter capsulatus]
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis]
gb|AAC37124.1| (L81176) FleQ [Pseudomonas aeruginosa]
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus]
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae]
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator NifA [Rhodospirillum rubrum]
gb|AAD38416.1| (AF155934) NifA [Alcaligenes faecalis]
gb|AAF28395.1| (AF069392) FlaM [Vibrio parahaemolyticus]
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein
gb|AAF61932.1| (AF230804) sigma-54 activator protein Acil [Myxococcus xanthus]
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa]
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae]
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae]
gb|AAG01527.1|AF288483_1 (AF288483) NifA [Azospirillum brasilense]
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli
pir||B49940 nitrogen regulator 1 homolog - Escherichia coli
pir||C70320 transcription regulator NifA family - Aquifex aeolicus
pir||C70396 transcription regulator NtrC family - Aquifex aeolicus
pir||C70454 transcription regulator NtrC family - Aquifex aeolicus
pir|P70315 transcription regulator NtrC family - Aquifex aeolicus
pir||H69581 transcription activator of acetoin dehydrogenase operon acoR - Bacillus subtilis
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens
pir||JC5471 regulatory protein NifA - Azospirillum lipoferum
pir||T08624 probable NtrC-type response regulator - Eubacterium acidaminophilum
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49.1 KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB)
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN
sp|Q06065|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ARGININE
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR

Moreover, a number of polypeptides belonging to the σ54 activator family have been
described whose 3D structures are known. These include: 113161 acetoin catabolism
regulatory protein; 113629 alginate biosynthesis transcriptional regulatory protein ALGB;
266789 type 4 fimbriae expression regulatory protein PILR; 113833 nitrogen fixation
protein ANFA; 138884 nitrogen fixation protein VNFA; 128219 nif-specific regulatory
protein; 3024194 nif-specific regulatory protein; 1168553 acetoacetate metabolism
regulatory protein ATOC (ornithine/arginine decarboxylase inhibitor) (ornithine
decarboxylase antizyme); 417166 transcriptional regulatory protein HYDG; 266622
nif-specific regulatory protein; 1352500 nif-specific regulatory protein; 128224 nif-specific
regulatory protein; 128225 nif-specific regulatory protein; 128221 nif-specific regulatory
protein; 128226 nif-specific regulatory protein; 1346014 transcriptional regulatory protein
FLBD; 549560 hypothetical sigma-54-dependent transcriptional regulator in GUTQ-HYPF
intergenic region; 139857 transcriptional regulatory protein XYLR (67 kd protein); 120053
formate hydrogenlyase transcriptional activator; 2507375 hypothetical 49.1 kd protein in
GLNB-PURL intergenic region (ORFXB) (orf-2); 134961 signal-transduction and
transcriptional-control protein; 1171795 nitrogen assimilation regulatory protein; 417388
nitrogen regulation protein NR(I); 123466 hydrogenase transcriptional regulatory protein
HOXA; 399925 hydrogenase transcriptional regulatory protein HOXA; 585586 nitrogen
assimilation regulatory protein NTRX; 118399 C4-dicarboxylate transport transcriptional
regulatory protein DCTD; 585267 pathogenicity locus probable regulatory protein HRPR;
1346313 pathogenicity locus probable regulatory protein HRPS; 549447 pathogenicity
locus probable regulatory protein WTSA; 585909 arginine utilization regulatory protein
ROCR; 136600 transcriptional regulatory protein TYRR; 1174836 transcriptional
regulatory protein TYRR homolog; 123748 hydrogenase transcriptional regulatory protein
HUPR1; 128604 nitrogen regulation protein NTRC; 1169293 glycerol metabolism operon
regulatory protein; and 129957 phosphoglycerate transport system transcriptional
regulatory protein PGTA. The numbers are GenBank gi numbers.

Preferably, the hybrid σ54 activator is based on the NifA activator. The Nif family of
bacterial enhancers regulates expression of nitrogenase components from σ54 promoters in
nitrogen-fixing bacteria, and its members are inhibited by NifL (Austin S. et al (1994)
J. Bacteriol. 176, 3460). In bacteria lacking NifL, NifA is constitutively active. NifA is
modular in architecture and it is shown herein that this allows for the swapping of the
natural DNA-binding domain (DBD) for heterologous DBDs. Such NifA-DBD chimaeras are
inactive on the wild type promoter, but activate transcription from hybrid promoters
bearing their cognate target sequences.

Advantageously, the hybrid σ54 activator may be based on E. coli PspF (see Jovanovic et
al., (1996) J. Bacteriol. 178:1936-1945). PspF lacks the N-terminal regulatory domain
typical of σ54 activators, and is constitutively active but negatively regulated by PspA.
Thus, in bacteria lacking PspA, PspF is constitutively active.

Other σ54 activators may be rendered constitutively active by removal of the N-terminal
regulatory domain or by appropriate mutation.

Nucleic acid binding sequences or domains are known in the art and may be derived from
σ54 activator proteins or any other DNA binding proteins, whether naturally-occurring or
synthetic. Moreover, DNA-binding domains may be synthesised by partial or complete
randomisation. Many naturally-occurring DNA-binding proteins contain independently
folded domains for the recognition of DNA, and these domains in turn belong to a large
number of structural families, such as the leucine zipper, the homeodomain, the
"helix-turn-helix", the zinc finger and various other transcription factor families.

C: Libraries

The term library refers to a mixture of heterogeneous polypeptides or nucleic acids. The
library is composed of members, each of which has a unique polypeptide or nucleic acid
sequence. To this extent library is synonymous with repertoire, although in general the
term "library" is used herein to denote the source of the repertoire, e.g. a library of
nucleic acid molecules which encodes a repertoire of polypeptides. Sequence differences
between library members are responsible for the diversity present in the library. The
library may take the form of a simple mixture of polypeptides or nucleic acids, or may be
in the form of organisms or cells, for example bacteria, viruses, animal or plant cells and
the like, transformed with a library of nucleic acids. Advantageously, the nucleic acids are
incorporated into expression vectors, in order to allow expression of the polypeptides
encoded by the nucleic acids. In a preferred aspect, therefore, a library may take the form
of a population of host organisms, each organism containing one or more copies of an
expression vector containing a single member of the library in nucleic acid form which
can be expressed to produce its corresponding polypeptide member. Thus, the population
of host organisms has the potential to encode a large repertoire of genetically diverse
polypeptide variants.

Libraries of hybrid proteins may be prepared and selected together with libraries of hybrid
nucleic acids. "Crossing" of hybrid libraries is performed by combinatorial infection,
which has been employed successfully to generate very large antibody libraries (Griffiths
et al (1994) EMBO J. 13, 3245).

Although libraries for use in the present invention may be phage libraries, as is known in
the art, it is possible to use alternative libraries which are constructed using other vectors,
such as plasmids. In any case, the present invention does not require the library to be
capable of "display" of the gene product at the bacterial surface, as with phage libraries;
rather, the gene product is preferably expressed intracellularly, and is advantageously not
expressed as a fusion with a vector gene product.

DNA binding domain libraries are preferably based on known DNA binding domain
architectures (e.g. basic leucine zipper, bZIP) and may be derived using PCR
amplification with "family-specific" primers. Such libraries may be crossed with hybrid
promoters bearing defined target sequences or libraries of target sequences. In addition to
providing information on the distribution of members of the family in a given genome,
such libraries may be used to identify and study proteins or molecular compounds that
modify DNA interaction within a family of DNA binding domains, for example Tax
(from HTLV-1) in the case of bZIP proteins.

In an alternative embodiment, they may also be used to select DNA binding domains
which conditionally bind their target sequence only in the presence of other factors such
as proteins, cofactors or small molecular compounds, for example drugs that intercalate
into DNA or alter the degree of supercoiling or recognise DNA sequences which have
been modified chemically (e.g. methylated). The system can also be used "in reverse", i.e.
to select proteins or molecular compounds that disrupt a particular DNA-protein
interaction or to select DNA binding domains that do not bind a particular target sequence
or library thereof.

More advanced libraries are preferably derived directly from genomic DNA or cDNA
libraries and selected on hybrid promoters bearing a repertoire of target sequences,
comprising either a stretch of randomised sequence or a library of inserts derived from
fragmented genomic DNA. Data obtained in this way allows the compilation of a genomic
directory of DNA binding domains and the building of a promoter-DNA binding domain
interaction map.

D: Hybrid polypeptides 

The generation of hybrid polypeptides by domain fusion is well known in the art and may
be effected by fusing polypeptides or, preferably, by fusing nucleic acids which encode
the polypeptides. It has been known since 1976 that DNA binding and transcriptional
activator domains are separable, and can be swapped between proteins; see Ma and
Ptashne, who reported (Cell, (1987) 51, 113-119; Cell, (1988) 55, 443-446) that when both
the GAL4 N-terminal domain and C-terminal domain are fused together in the same
protein, transcriptional activity is induced. Other proteins are also known to function as
transcriptional activators via the same mechanism. For example, the GCN4 protein of
Saccharomyces cerevisiae as reported by Hope and Struhl, Cell, 46, 885-894 (1986), the
ADR1 protein of Saccharomyces cerevisiae as reported by Thukral et al., Molecular and
Cellular Biology, 9, 2360-2369 (1989) and the human estrogen receptor, as discussed by
Kumar et al., Cell, 51, 941-951 (1987) all contain separable domains for DNA binding
and for maximal transcriptional activation.

The same is specifically known of the σ54 bacterial transcriptional activators, although a
genetic screen based thereon has not been proposed. Therefore, the present invention may
be carried out using techniques which are known to those skilled in the art, particularly as
applied to 2-hybrid techniques in eukaryotic cells.

Synthesis of chimeric genes for the purposes of the present invention may be carried out
by any desired means, including polynucleotide synthesis and mutagenesis approaches.
For example, a number of methods for site-directed mutagenesis are known in the art,
from methods employing single-stranded phage such as M13 to PCR-based techniques
(see "PCR Protocols: A guide to methods and applications", M.A. Innis, D.H. Gelfand,
J.J. Sninsky, T.J. White (eds.), Academic Press, New York, 1990). Preferably, the
commercially available Altered Site II Mutagenesis System (Promega) may be employed,
according to the directions given by the manufacturer.

E: Host Cells

Host cells useful in conjunction with the present invention are prokaryotic cells,
advantageously bacterial cells. E. coli is the preferred host; however, host cells may
belong to any species or genus in which σ54 RNAP-driven transcription is possible, such
as Klebsiella, Rhodobacter, Rhizobium, Acinetobacter, Pseudomonas, Escherichia,
Bacillus, Caulobacter, Vibrio, Rhodospirillum, Alcaligenes, Salmonella, Myxococcus,
Xylella, Azospirillum, Aquifex, Agrobacterium and other organisms. In E. coli, the
preferred configuration is a modified strain, in which a truncated form of Nif (or another
activator) is coexpressed to boost specific activation (see Methods).

Preferably, the host cells lack repressors of the σ54 activator being used, such that the
transcription activating domain is constitutively active. Repressors may be deleted by
genetic mutation and/or selection, or inhibited by expression of antisense constructs, or
the like. In general, due to the accessibility of bacterial genetics, especially in E. coli,
deletion of repressor genes is straightforward to those skilled in the art.

F: Reporter Genes 

Reporter genes of various types are known in the art and may be used in conjunction with
the present invention. A "reporter gene", as referred to herein, may be the coding
sequence which encodes a detectable gene product, or the coding sequence including the
necessary control sequences for its expression in accordance with the invention, as
appropriate.

Advantageously, the reporter gene is selected from the group consisting of metabolic
markers such as the lac operon (lacZ, lacY and lacA); proteins conferring a fluorescent
phenotype, such as GFP; proteins conferring antibiotic resistance, such as Zeo; and
proteins conferring another selectable property.

Certain reporters, such as the LacZ gene, are widely used in bacterial genetics and are
useful in the performance of the invention. However, other genes may also be employed,
including fluorescent proteins. For example, green fluorescent proteins (GFPs) of
cnidarians, which act as their energy-transfer acceptors in bioluminescence, can be used in
the invention. A green fluorescent protein, as used herein, is a protein that fluoresces
green light, and a blue fluorescent protein is a protein that fluoresces blue light. GFPs
have been isolated from the Pacific Northwest jellyfish, Aequorea victoria, from the sea
pansy, Renilla reniformis, and from Phialidium gregarium (Ward et al., 1982,
Photochem. Photobiol., 35: 803-808; Levine et al., 1982, Comp. Biochem. Physiol., 72B:
77-85).

A variety of Aequorea-related GFPs having useful excitation and emission spectra have
been engineered by modifying the amino acid sequence of a naturally-occurring GFP from
Aequorea victoria (Prasher et al., 1992, Gene, 111: 229-233; Heim et al., 1994, Proc.
Natl. Acad. Sci. U.S.A., 91: 12501-12504; PCT/US95/14692). As used herein, a
fluorescent protein is an Aequorea-related fluorescent protein if any contiguous sequence
of 150 amino acids of the fluorescent protein has at least 85% sequence identity with an
amino acid sequence, either contiguous or non-contiguous, from the wild-type Aequorea
green fluorescent protein (SwissProt Accession No. P42212). Similarly, the fluorescent
protein may be related to Renilla or Phialidium wild-type fluorescent proteins using the
same standards.

Aequorea-related fluorescent proteins include, for example, wild-type (native) Aequorea
victoria GFP, whose nucleotide and deduced amino acid sequences are presented in
GenBank Accession Nos. L29345, M62654, M62653, and other Aequorea-related
engineered versions of Green Fluorescent Protein, of which some are listed above. Several
of these, i.e., P4, P4-3, W7 and W2, fluoresce at a distinctly shorter wavelength than wild
type.

A specific advantage of fluorescent proteins is that they facilitate FACS sorting of cells in
a manner dependent on reporter gene expression (Norman, S.O. (1980). Flow cytometry.
Med. Phys. 7, 609-615; Mackenzie, N.M. & Pinder, A.C. (1986). The application of flow
microfluorimetry to biomedical research and diagnosis: a review. Dev. Biol. Stand. 64,
181-193).

Other reporter genes may complement auxotrophic mutations, or confer antibiotic
resistance or other selectable characteristics on the host bacteria. Reporter genes may be
wholly or partly heterologous to the host cell, and introduced by mutagenesis and/or
transformation with appropriate vectors. Alternatively, endogenous σ54-responsive genes
may be used as reporter genes.

The reporter gene also contains a binding site for σ54 RNAP. The consensus sequence
for σ54 RNAP binding is 5' TGGCAC-N5-TTGCa/t 3'. This sequence is located at -12 to
-24 with respect to the start of transcription, whilst the more common sigma 70
recognition sequence is situated at -10 to -35. Both the GG & GC must be on the same
face of the DNA helix.
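
By way of illustration only, the consensus given above may be located in a candidate
promoter sequence with a short script. The following Python sketch assumes exact matching
to the consensus on the strand supplied; natural σ54 promoters may deviate from the
consensus, so a practical search would likely tolerate mismatches. The function name and
the test sequence are hypothetical and are not taken from the specification.

```python
import re

# Consensus quoted in the text: 5' TGGCAC-N5-TTGCa/t 3'
# N5 = any five bases; the last position is A or T.
SIGMA54_CONSENSUS = re.compile(r"TGGCAC[ACGT]{5}TTGC[AT]")


def find_sigma54_sites(seq):
    """Return (start_index, matched_sequence) for each exact consensus match.

    Only the strand passed in is scanned; indices are 0-based.
    """
    seq = seq.upper()
    return [(m.start(), m.group()) for m in SIGMA54_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence built to contain one consensus match.
    example = "ccgatTGGCACagattTTGCAtgacc"
    for start, site in find_sigma54_sites(example):
        print(start, site)
```

Scanning the reverse complement as well would be needed to find sites on the opposite
strand; this is omitted here for brevity.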

In order to increase specificity, combinations of two or more reporter genes (preferably in
tandem) may be used.

Where the reporter gene is chimeric, i.e. comprises heterologous binding sites for the
nucleic acid binding sequence and σ54 RNAP binding sites incorporated into the same
nucleic acid, the spacing between the σ54 RNAP binding site and the nucleic acid binding
sequence binding site is preferably conserved with respect to the natural gene from which
the σ54 RNAP binding site is taken. Advantageously, the spacing is at least calculated
such that the spatial relationship of the elements on respective faces of the nucleic acid
helix is maintained.
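
The requirement that the elements remain on the same faces of the helix can be checked
arithmetically. The sketch below is illustrative only: it assumes a B-form helical repeat
of about 10.5 base pairs per turn, and the 40 degree tolerance, the function names and the
example spacings are hypothetical values chosen for demonstration, not parameters taken
from the specification.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA (base pairs per turn)


def angular_offset(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Rotation in degrees (0 to 360) between two sites spacing_bp apart."""
    return (spacing_bp / bp_per_turn * 360.0) % 360.0


def same_face(natural_spacing, new_spacing, tolerance_deg=40.0,
              bp_per_turn=BP_PER_TURN):
    """True if new_spacing keeps the two sites in roughly the same rotational
    relationship as natural_spacing (difference within tolerance_deg)."""
    diff = abs(angular_offset(natural_spacing, bp_per_turn)
               - angular_offset(new_spacing, bp_per_turn))
    diff = min(diff, 360.0 - diff)  # shortest way around the circle
    return diff <= tolerance_deg


if __name__ == "__main__":
    # Adding two full turns (21 bp) preserves the phase; adding 5 bp
    # (about half a turn) moves the site to the opposite face.
    print(same_face(100, 121))
    print(same_face(100, 105))
```

Under these assumptions, spacings that differ by whole multiples of the helical repeat
preserve the rotational relationship, whereas a change of about half a turn places the
binding site on the opposite face of the helix.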

Reporter genes advantageously comprise a binding site for a further activation factor, such
as IHF. These factors are believed to induce bending of the DNA, thus potentiating
activation of σ54 RNAP-driven transcription by σ54 activators. Alternatively, the DNA
itself may be intrinsically bent, thus providing constitutive potentiation of σ54-specific
activation.

G: Configurations of the Invention 

The present invention may be configured in three basic ways: a first configuration, in
which reporter gene activation is dependent on the interaction between the nucleic acid
and a nucleic acid binding domain on the hybrid protein; a second configuration, in which
reporter gene activation is dependent on interaction between bait and prey polypeptides
which serves to bring together two or more components of the hybrid protein; and a third
configuration, in which reporter gene activation by a σ54 activator is dependent on the
presence of DNA-bending polypeptides. As referred to herein, an interaction is
advantageously a binding interaction.

Where the invention is configured to detect protein-nucleic acid interaction, libraries of
proteins and/or nucleic acids may be prepared as described above. Proteins having
improved nucleic acid binding, or nucleic acid sequences having improved affinity for
protein domains, may be developed by mutagenesis and selection of candidate sequences.
Alternatively, protein and/or nucleic acid sequences may be used to identify, in vivo,
cognate binding partners.

Chimeric σ54 activators also offer the opportunity to better understand aspects of the
process of transcriptional activation at σ54 promoters. In the case of NifA, it is known that
binding of the target sequence together with ATP binding promotes oligomerisation of
NifA. It is believed that it is the oligomer which contacts the polymerase and catalyses the
ATP-driven isomerisation of the polymerase holoenzyme. Taking advantage of the
superactivation effect described above it may be possible to address questions such as
which components of the oligomer (e.g. the DNA-bound NifA vs. NifAΔC, i.e. NifA with
the DNA binding domain removed) are contacting the polymerase and/or coupling
ATP hydrolysis to transcriptional activation. Furthermore, usage of NifAΔC cofactors
from different species (together with their diversification by PCR shuffling) allows
identification of the sequence regions critical for transcriptional activation and a
"maturation" of the NifAΔC coactivator. Indeed, we have found the NifA from K.
pneumoniae to be a superior cofactor to A. vinelandii NifA. Finally, it may be possible to
use chimeras of known DNA binding domains (e.g. GCN4) and a cDNA library as a
prokaryotic "enhancer trap", to isolate σ54 activators on a genome-wide scale.

Configuration of the invention to detect protein-protein interactions follows the general
scheme of the yeast two-hybrid assay, and the reagents used in the invention may be set
up accordingly. In general, therefore, the invention will comprise a nucleic acid binding
domain-bait fusion, and a prey-σ54 activator domain fusion. Although, in general, "bait"
refers to a known polypeptide and "prey" to an unknown polypeptide, the terms may be
used interchangeably. Indeed, the invention comprises configurations in which both bait
and prey are known, or both are unknown.

Binding between the bait and the prey results in constitution of a hybrid protein which
comprises both a nucleic acid binding domain and a σ54 activator domain. The hybrid
protein is able to activate transcription from a reporter gene, thus providing a bait:prey
binding-dependent signal.

Protein-protein interactions may be selected using the preferred NifA system, in which the
hybrid σ54 transcriptional activator includes the NifA activation domain. The NifA
bacterial two-hybrid system may be used for the generation of interaction matrices between
cDNA libraries. Ultimately such interaction matrices may yield an interaction map of the
proteins of an organism. The invention provides an alternative to the yeast two hybrid
system.

Systems based on σ54 have a number of advantages over the other systems that are
available, e.g. the conceptually similar yeast one- and two-hybrid systems. A bacterial host
allows substantially larger repertoires to be obtained and thus a much larger molecular
diversity to be screened. In particular, using combinatorial infection, the system of the
invention allows the "crossing" of σ54-chimera repertoires with libraries of hybrid
reporter constructs, thus permitting coevolution of DNA binding domains and recognition
sites, or coselection of DNA binding domains and target sites from genomic libraries.

Because selection in the σ54-based system is based on a positive readout, i.e. activation of
transcription, it is less prone to false positives than other approaches relying on the
inhibitory effect of the expressed DNA binding domains, like the transcription interference
assay (Elledge S.J. et al (1989) Proc. Nat. Acad. Sci. USA 86, 3689). In vivo selection in
general may result in the selection of novel DNA binding domains that are more attuned to
working under realistic conditions, including supercoiling of the recognition site, presence
of a large excess of chromosomal DNA and high protein concentration. Another
advantage of the system of the invention is that extremely low levels of the hybrid protein
appear to be sufficient to effect maximum activation of transcription. This is particularly
helpful in the case of DNA binding domains that are prone to aggregation.

The σ54-based systems of the present invention may be further adapted to take into
account potential disadvantages of bacterial expression. For instance, E. coli expression
may be suboptimal for large eukaryotic transcription factors. However, large eukaryotic
proteins can often be split into smaller domains which retain function and are usually
readily expressed in E. coli.

According to the third configuration, a constitutively active σ54 activator may be used to
screen a library of candidate DNA-bending polypeptides, preferably in an IHF-negative
host. Since the degree of activation by the σ54 activator may be dependent on DNA
bending by additional factors, the levels of expression of the reporter gene will be
modulated by the DNA-bending activity of the candidate DNA-bending polypeptides.

The invention is further described, for the purposes of illustration only, in the following
examples.

Examples

NifA from A. vinelandii is a well-studied member of the family of bacterial enhancers and
it is a positive regulator of the expression of nitrogenase components in diazotrophs. It is
inhibited by NifL in response to the presence of oxygen or ammonia. When expressed in E.
coli, which lacks endogenous NifL or an equivalent, NifA is constitutively active. Because
of the highly conserved nature of the activation mechanism of σ54 RNA polymerase, NifA
is a very strong activator of transcription in E. coli.

Like other members of the family of bacterial enhancer proteins, NifA is modular in
architecture, both structurally and functionally, comprising 3 domains: an N-terminal sensor
domain, a central activation domain (AD), and a C-terminal DNA binding domain (DBD).
The central activation domain (AD) can activate transcription independent of DNA
binding if overexpressed. Thus the function of the DBD appears to be primarily to increase
the concentration of the activation domain in the proximity of the promoter.

We have exploited the modularity of the enhancer structure and swapped the natural NifA
DNA binding domain (DBD) for heterologous DBDs and libraries thereof. Here we
describe the activity of these NifA-chimeras in the activation of transcription from the σ54
dependent promoter nifH and hybrids thereof.

Materials & Methods 

Media & Reagents

2xTY and MacConkey agar are described elsewhere (Miller J.H. (1972) Experiments in
molecular genetics. Cold Spring Harbor, NY). Antibiotics were used at the following
concentrations: Ampicillin 0.1 mg/ml, Chloramphenicol 10 µg/ml, Streptomycin 25 µg/ml.
Min-lac medium was essentially M9 medium (Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y.) supplemented with 1mM MgSO4, 20nM CaCl2, 2% (w/v) lactose, 2mg/ml
casamino acids, 40 µg/ml L-tryptophan, 3 µg/ml thiamine and appropriate antibiotics.
Min-lacX plates were essentially M9 plates supplemented with 2% lactose, appropriate
antibiotics and 40 µg/ml X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside).

Strains 

TG1ΔK was derived from TG1 (Gibson T. J. (1984) Studies on the Epstein-Barr virus
genome. University of Cambridge) using the genome integration strategy of Haldimann A.
et al., (1996) Proc. Nat. Acad. Sci. USA 93, 14361. Briefly, NifA (K. pneumoniae)
residues 1-462 were amplified using Pfu polymerase (Stratagene) and primers 1 (5'- GAG
TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC -3'), 2 (5'- CGC GGA
TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C -3'), cut with
NdeI and BamHI, cloned into the genome targeting suicide vector pSKS0D-uidA2
(Haldimann, Op. Cit.) and transformed into the Pir+ host strain BW23473 (Metcalf W.W.
et al (1994) Plasmid 33, 1). Vectors were isolated and transformed into the Pir- strain TG1
harbouring the plasmid pINT-ts (Hasan N. et al (1994) Gene 150, 51). Chromosomal
integration was induced by a temperature shift to 42°C, which leads to expression of λ
integrase from pINT-ts and simultaneously stops its replication. Integrants were identified
by Kanamycin resistance and screened for Nif coactivation. Once obtained, TG1ΔK was
grown routinely without antibiotic selection.
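
As a simple check on the cloning design, the restriction sites introduced by primers 1 and
2 can be confirmed computationally. The sketch below is illustrative only. The primer
sequences are those quoted above; the recognition sequences used (NdeI CATATG, BamHI
GGATCC, NotI GCGGCCGC) are the standard ones. NotI is not mentioned in the text and is
included here only because its recognition sequence also occurs in primer 2. The function
names are hypothetical.

```python
# Standard recognition sequences (5'->3'); NotI is an addition for
# illustration and is not named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "NotI": "GCGGCCGC"}

# Primer sequences as quoted in the Strains section.
PRIMER_1 = "GAG TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC"
PRIMER_2 = "CGC GGA TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C"


def clean(primer):
    """Remove whitespace and upper-case a primer written in triplets."""
    return "".join(primer.split()).upper()


def sites_in(primer):
    """Map enzyme name -> 0-based index of its first site in the primer."""
    seq = clean(primer)
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print("primer 1:", sites_in(PRIMER_1))
    print("primer 2:", sites_in(PRIMER_2))
```

Primer 1 carries the NdeI site and primer 2 the BamHI site, consistent with the digestion
described above.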

Constructs 

Chimeric constructs were based on pDB737 (Austin S. et al (1994) J. Bacteriol. 176, 3460;
Buck M. et al (1986) Nature 320, 374) encoding NifA (A. vinelandii) under the control of
the T7 promoter in the plasmid pT7-7 (Tabor S. & Richardson C.C. (1985) Proc Natl
Acad Sci USA 82, 1074). Expression was by leakiness of the T7 promoter. Chimeras were
constructed taking advantage of a unique BanII cutting site in the linker region between
the central domain of NifA and the DBD. GCN4 was amplified using Pfu polymerase
(Stratagene) and primers 3 (5'- GCT GCC AGC GAG AGC CCG CCG CTC GCC GCG
ATT GTG CCC GAA TCC AGT GAT CCT -3') and 4 (5'- GAG CTA AAG CTT TTA
TTA GCG TTC GCC AAC TAA TTT CTT TAA TCT GGC -3'), cut with BanII and
Hind3 and ligated into pDB737 cut with BanII and Hind3. ERDBD was amplified using
primers 5 (5'- GTC GAC AAC GAG AGC CCG CCG CTC GCC GCG GAA ACG CGT
TAG TGC GCT GTT TGC -3') and 6 (5'- GGT CAG CGC GTG GAT CCT TAA CCA
CCA CGA CGG TCT TTA CG -3'), cut with BanII and BamHI and ligated into pDB737
cut with BanII and BamHI. The vector p737S1 is derived from pDB737 by replacing the
bla gene with aadA conferring streptomycin resistance and by inserting an f1 phage
origin for packaging of the vector into filamentous phage particles. Briefly, aadA was
amplified using primers 7 (5'- TCA GCG CAC GCT GAC GTC GTG GAA ACG GAT
GAA GGC ACG AAC -3') and 8 (5'- CCG CCT GGA GCT GGC CAT TAT TTG CCG ACT
ACC TTG GTG ATC TCG CC -3'), cut with AatII and MscI and ligated with
pDB737 cut with AatII and ScaI. The resulting vector p737S was cut with AatII and ClaI. The
f1 ori was amplified using primers 9 (5'- GCT GCC GAC TCG ATC GAT GAA TGG
CGA ATG GCG CCT GAT GCG G -3') and 10 (5'- CCG GGT CGT GAC GTC AGT GTT
GGC GGG TGT CGG GGC TGG C -3'), cut with AatII and ClaI and cloned into the cut
p737S to give p737S1. NifA-X chimeras were transferred from pDB737 to p737S1 by
digestion with NdeI and Hind3 (BamHI for NifA-ERDBD).

Reporter constructs were derived from pACYC184 and the vector pMB1 (Buck M. et al.
(1986) Nature 320, 374). Briefly, the lac-operon (lacZYA) was amplified with primers 11
(5'- GAG TCA ATT CGG GGA TCC CGT CGT TTT ACA ACG TCG TGA CTG G -3') and
12 (5'- GAG TCA TTC TGG CCA GTC GAC CGC TCT GCC GGT GGT TAC -3') and
cut with BamHI and MscI. The nifH promoter segment from pMB1 was amplified with
primers 13 (5'- GAG TCA TTC AAG CTT GCG TGG AAT AAG ACA CAG GGG
GCG -3') and 14 (5'- GAG TCA TTC GGG ATC CCC GGA TTT ACC GAT ACC GCC
TTT ACC -3') and cut with Hind3 and BamHI, and the 2 fragments were simultaneously ligated
with pACYC184 cut with Hind3 and BsaAI to give pMB3. The f1 ori was amplified with
primers 15 (5'- GCT GCC GAC TCG GCT AGC CAA TGG CGA ATG GCG CCT GAT
GCG G -3') and 16 (5'- GCC GGG TCG CTT TAA AGT GTT GGC GGG TGT CGG GGC
TGG C -3'), cut with NheI and DraI and ligated into pMB3 cut with both NheI and XmnI
to give pMB31.
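As a quick consistency check on the cloning scheme, one can confirm that each primer carries the recognition sequence of the enzyme used to cut its product. The sketch below is illustrative only: the primer strings are copied from the text as transcribed (and may carry transcription errors), the recognition sequences are the standard ones for NdeI, BamHI, HindIII (written Hind3 in the text) and MscI, and the helper names are hypothetical.

```python
import re

# Standard recognition sequences for enzymes named in the text.
SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",  # written "Hind3" in the text
    "MscI": "TGGCCA",
}

# Primers as transcribed above (number -> sequence, enzyme expected to cut).
PRIMERS = {
    1: ("GAG TCA CTA ACG CAT ATG ATC CAT AAA TCC GAT TCG GAC", "NdeI"),
    2: ("CGC GGA TCC AAG CGG CCG CTC ATT AGC GAT GGT TGA ACA GAA TCA C", "BamHI"),
    11: ("GAG TCA ATT CGG GGA TCC CGT CGT TTT ACA ACG TCG TGA CTG G", "BamHI"),
    12: ("GAG TCA TTC TGG CCA GTC GAC CGC TCT GCC GGT GGT TAC", "MscI"),
    13: ("GAG TCA TTC AAG CTT GCG TGG AAT AAG ACA CAG GGG GCG", "HindIII"),
    14: ("GAG TCA TTC GGG ATC CCC GGA TTT ACC GAT ACC GCC TTT ACC", "BamHI"),
}


def clean(primer: str) -> str:
    """Drop spaces and any non-base characters, returning upper-case bases."""
    return re.sub(r"[^ACGT]", "", primer.upper())


def has_site(primer: str, enzyme: str) -> bool:
    """True if the primer contains the enzyme's recognition sequence."""
    return SITES[enzyme] in clean(primer)


if __name__ == "__main__":
    for number, (sequence, enzyme) in sorted(PRIMERS.items()):
        print(f"primer {number}: {enzyme} site present = {has_site(sequence, enzyme)}")
```

Enzymes with degenerate recognition sequences (for example BanII) would need a pattern match rather than a plain substring test.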


Selection and screening 

Cells were cotransformed either by simultaneous or sequential electroporation with an
expressor construct and a reporter construct, grown overnight with appropriate
antibiotic selection at 34°C in M9-lac medium and plated out. β-gal expression was scored
either on MacConkey or Min-lacX indicator plates or by ONPG enzyme assay of
selected colonies (see below).

Enzyme assay 

ONPG assays used to measure β-gal activity were essentially as described by Kolmar H.
et al. (1995) EMBO J 14, 3895. Briefly, 20 µl of an overnight culture was transferred to a
microtitre well, 100 µl of chloroform-saturated Z-buffer (100mM NaHPO4, 1mM KCl,
1mM MgSO4, 50mM β-mercaptoethanol, pH 7.0 (Miller J.H. (1972) Experiments in
molecular genetics. Cold Spring Harbor, NY)) was added and the optical density at 600nm
was determined using an ELISA reader. Cells were lysed by addition of 50 µl Z-buffer with
0.4% (w/v) SDS and incubated at 30°C for 10 min. 50 µl of Z-buffer with 4mg/ml
o-nitrophenyl-β-D-galactopyranoside were added and the optical density at 420nm was
recorded automatically every 15 s over a period of 60 min. Specific β-gal activity was
calculated from the Vmax as in Miller (Op. Cit.).
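The calculation from the kinetic read-out can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that Vmax is taken as the steepest least-squares slope of OD420 over a short sliding window, and that activity is normalised Miller-style as 1000 x rate / (OD600 x culture volume in ml). The function names and the window size are hypothetical.

```python
def max_rate(times_min, od420, window=5):
    """Steepest least-squares slope (OD420 per minute) over any run of `window` readings."""
    best = 0.0
    for start in range(len(times_min) - window + 1):
        t = times_min[start:start + window]
        y = od420[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        numerator = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        denominator = sum((a - t_mean) ** 2 for a in t)
        best = max(best, numerator / denominator)
    return best


def specific_activity(vmax, od600, culture_ml=0.02):
    """Miller-style units: 1000 * rate / (OD600 * culture volume in ml).

    culture_ml defaults to the 20 microlitre aliquot described in the text.
    """
    return 1000.0 * vmax / (od600 * culture_ml)


if __name__ == "__main__":
    # Hypothetical readings every 15 s (0.25 min) with a constant rate of 0.02 OD420/min.
    times = [i * 0.25 for i in range(9)]
    readings = [0.1 + 0.02 * t for t in times]
    rate = max_rate(times, readings)
    print(rate, specific_activity(rate, od600=0.5))
```

Taking the maximum slope rather than a single end-point reading avoids the lag phase at the start of the reaction and substrate depletion at the end.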

Example 1: NifA chimeras with heterologous DNA binding domains activate
transcription but only from promoters with a cognate recognition site

To investigate in what way transcription activation by NifA was dependent on the NifA
DNA binding domain (DBD) and on native nif promoter structure, we prepared
NifA-chimeras in which the NifA DBD had been replaced by heterologous
DBDs of diverse structural architectures. Initially we explored DBDs which, like the NifA
wild type (wt) DBD, bind to symmetrical DNA recognition sequences, such as the basic
leucine zipper (bZIP) DBD of the yeast transcription factor GCN4 and the Zn-finger domain
of the human estrogen receptor DNA binding domain (ERDBD), and determined their
capacity to activate transcription of a lacZ reporter gene in vivo from a hybrid nifH
promoter, in which the NifA UAS had been deleted and replaced by recognition sites for
the heterologous DBDs.

In order to simplify comparison of transcription activation by NifA chimeras with
activation by wt NifA, all reporter constructs had a single DNA recognition site. The wt
nifH promoter UAS contains three bona fide NifA recognition sites. Deletion of the two
sites more distal to the promoter, however, did not appear to reduce transcription
activation in our reporter under the conditions tested.

Transcription activation by NifA-chimeras was specific in that they only activated lacZ
expression from hybrid promoters bearing their cognate recognition sequences but not
from control reporter constructs bearing wild type UAS or a non-cognate site (Fig. 3). In
analogy to wt NifA, the presence of two or more recognition sites (in phase, see below) did
not increase activation by the NifA-GCN4 chimera.

Activity was also dependent on the phasing of the recognition site with respect to the
promoter: when the symmetric ATF/CREB recognition site for GCN4 was offset in
increments of 1 bp, optimal activity was observed when the ATF/CREB site was centred on
the same bp as the symmetric wt UAS. Presumably efficient contact with the RNA Pol
holoenzyme requires that the activator be bound on the right face of the DNA.
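The phasing argument can be made concrete with a small calculation of how far a binding site rotates around the helix when it is shifted along the DNA. The sketch below assumes a helical repeat of about 10.5 bp per turn for B-form DNA; that value and the function names are illustrative assumptions and are not taken from the experiments described here.

```python
BP_PER_TURN = 10.5  # assumed approximate helical repeat of B-form DNA


def face_angle(offset_bp):
    """Rotation in degrees (0 to 360) of a site shifted by offset_bp base pairs."""
    return (offset_bp * 360.0 / BP_PER_TURN) % 360.0


def angular_distance(offset_bp):
    """Smallest angle in degrees between the shifted site and its original face."""
    angle = face_angle(offset_bp)
    return min(angle, 360.0 - angle)


if __name__ == "__main__":
    for offset in range(0, 11):
        print(f"offset {offset:2d} bp -> {angular_distance(offset):6.1f} degrees off the original face")
```

Under this assumption a shift of about 5 bp places the activator on nearly the opposite face of the helix, while a shift of one full turn restores the original orientation, which is consistent with activity depending on the centring of the site.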

Transcription activation by NifA-chimeras appears to preserve the fine specificity of isolated
DBDs. Wild-type GCN4 binds with equal affinity to the symmetric ATF/CREB site as
well as to the pseudo-symmetric AP-1 site. Indeed, the NifA-GCN4 chimera showed
identical levels of transcription activation in reporter constructs with either of these sites
(Fig. 3). A NifA-ERDBD chimera showed strong activity on a reporter with its cognate
ERE site but no activity above background levels with reporters bearing the similar GRE
recognition site for the closely related glucocorticoid receptor DBD (Fig. 4).

Example 2: Coexpression of wild-type NifA with NifA-chimeras boosts specific
transcription activation by NifA chimeras in a specific and DNA independent manner

The level of transcription activation by the NifA-GCN4 and NifA-ERDBD chimeras was
lower (ca. 10%) than for wt NifA. However, near wt levels of activity (up to 80%) were
reached when wt NifA was coexpressed within the same cell as a "coactivator".

The coactivation was independent of DNA binding, as a NifA variant in which the DBD
had been deleted (NifAΔC) was found to be just as active as wt NifA. On the other hand,
coexpression of an isolated NifA central domain (both the DBD as well as the N-terminal
sensor domain deleted (NifAΔNC)) failed to coactivate. NifA derived from different
species showed greatly variable efficiencies as coactivators. NifA variants from K.
pneumoniae (NifA Kp, NifAΔC Kp) were almost three times as effective as NifA, while
NifA variants from Rhizobium (NifA Rh1, NifA Rh2) were poorly active as coactivators
(Fig. 5).

The coactivator effect was found to enhance only specific transcriptional activation and not
background levels of transcription from promoters with non-cognate recognition sites. We
therefore constructed an E. coli strain expressing NifAΔC Kp (the K. pneumoniae NifA
with its DBD deleted) from a weak promoter (phoB) from the chromosome (TG1ΔK).

The coactivation effect has analogies in eukaryotic transcription, for example the enhancer
Sp1, in which isolated Sp1 activation domains can stimulate transcriptional activation by
the DNA binding form of Sp1, a phenomenon termed "superactivation".

Example 3: Tethering of NifA chimera at the UAS is sufficient for activation, but strong
activation requires correct positioning

We also investigated transcription activation by NifA-chimeras with asymmetrical
recognition sites such as the classic Zn-finger Zif268 as well as the DBD from p53.

Both NifA-Zif268 and NifA-p53 chimeras activated transcription, but only at low levels
(2-5-fold above the background). However, when the Zif recognition site was duplicated to
give a symmetric palindromic site, transcription activation increased substantially.
Non-palindromic duplication of the recognition site in tandem did not increase activation.
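The difference between the two ways of duplicating an asymmetric site can be illustrated at the sequence level. The sketch below is an assumption-laden example: it uses the commonly cited 9 bp Zif268 consensus GCGTGGGCG as a stand-in, since the actual reporter sequences are not given here, and it builds the palindrome by appending the reverse complement, which is only one possible arrangement of the two half-sites. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Commonly cited Zif268 consensus, used here only as a stand-in site.
SITE = "GCGTGGGCG"


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tandem_duplicate(site):
    """Two copies in the same orientation (non-palindromic duplication)."""
    return site + site


def palindromic_duplicate(site):
    """One copy followed by its reverse complement (symmetric duplication)."""
    return site + reverse_complement(site)


def is_palindromic(seq):
    """True if the sequence reads the same on both strands."""
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    for label, seq in (("tandem", tandem_duplicate(SITE)),
                       ("palindromic", palindromic_duplicate(SITE))):
        print(f"{label:12s} {seq}  symmetric={is_palindromic(seq)}")
```

Only the symmetric arrangement presents the two half-sites in mirror-image orientation, which is what a pair of DBDs bound as a symmetric dimer would require.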

Thus while simple tethering is sufficient for some activation, only bipartite binding
appears to give a strong activation. Presumably, tethering only leads to an approximate
positioning of the activation domain with respect to the RNA polymerase holoenzyme,
thereby reducing the likelihood of a productive interaction.

Example 4: Selection of active NifA-chimeras by lac complementation

Using expression of the lac operon (lacZYA) from our reporter construct as the read-out
of transcription activation allows the selection of active NifA-chimera on the basis of
metabolic complementation of a Δlac strain, with lactose as the only carbon source.
Initially we spiked populations of NifA-ERDBD with NifA-GCN4 at the ratios 1/10^,
1/10^ in the presence of the GCN4 cognate reporter ATF/CREB-nifH and grew
populations overnight in minimal medium supplied with lactose. Pre- and post selection
populations were scored by plating on MacConkey-lactose plates as well as by PCR
screening. The results are summarised in Table 1. Selection factors of up to 10,000-fold
per round were observed.



Table 1: Selection factors for Nif selection by lac complementation

NifGCN4/NifERDBD     Selection factor
1/10^                40 fold
1/10^                40 fold
1/10^                200 fold
1/10^                4000 fold
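A selection factor of this kind is conventionally the ratio of the positive fraction after selection to the positive fraction before selection. The sketch below shows that arithmetic under this assumed definition; the colony counts are hypothetical and are not data from Table 1.

```python
def selection_factor(pre_positive, pre_total, post_positive, post_total):
    """Enrichment: positive fraction after selection divided by the fraction before."""
    pre_fraction = pre_positive / pre_total
    post_fraction = post_positive / post_total
    return post_fraction / pre_fraction


if __name__ == "__main__":
    # Hypothetical counts: 1 positive in 10,000 before, 40 in 10,000 after selection.
    factor = selection_factor(1, 10_000, 40, 10_000)
    print(f"selection factor: {factor:.0f}-fold")
```

When no positives are found in the pre-selection sample, only a lower bound on the factor can be stated, using the number of colonies screened as the denominator.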



Example 5: Selection of active NifA-chimeras by flow cytometry. 

Expression of β-galactosidase (lacZ) as the read-out of transcription activation allows the
selection of active NifA-chimera on the basis of metabolic complementation of a Δlac
strain, grown on lactose as the only carbon source. However, metabolic selection
predisposes the system to the generation of false positives. Presumably, the prolonged
growth under metabolic selection selects for mutant promoters that are active in the
absence of a cognate enhancer.

We have observed that this only occurs for library sizes exceeding 10^. Indeed, others
have found (using a related bacterial two-hybrid system) that it is not possible to retrieve
positive clones from dilutions higher than 1/10^ by metabolic lac selection (G. Karimova,
et al., (1998) Proc Natl Acad Sci USA 95, 12532-7). As it is well known that bacteria can
develop a mutator phenotype under adaptive stress (P. D. Sniegowski, P. J. Gerrish, R. E.
Lenski, (1997) Nature 387, 703-5), we conclude that it is preferable to separate the
selection from the amplification (growth) step in order to reduce the likelihood of
revertants.



We thus replaced lacZ with the Aequorea victoria green fluorescent protein (EGFP: F64L,
S65T, ex 488 nm, em 527 nm; the Clontech variant pGFPmut3.1, S65G, S72A, ex 501 nm,
em 511 nm, a FACS optimised variant, was also tried but found inferior) as the reporter gene.
GFP has the advantage that cells can be grown first and then separated on the basis of
fluorescence using fluorescence activated cell sorting (FACS).

We prepared a trial library of mutant GCN4 bZIP DBDs (library size 10*) in which 5 key
residues (Asn235, Ala238, Ala239, Ser242, Arg243) of GCN4 interacting with DNA were
randomised and selected it against a GFP hybrid reporter with the cognate ATF/CREB
site. Library populations were grown overnight at 34°C in non-fluorescent medium NFM
(minimal medium supplied with 2% glucose, 0.2% casamino acids, 12 ng/ml L-Trp). For
FACS (Cytomation MoFlo, 488 nm laser, FL-1 530/40 filter) a 1 ml aliquot was diluted
10X in NFM and the top 1% fluorescent cell population was sorted into a 96 well plate at
1 cell per well, and grown up overnight at 34°C. Cell fluorescence of the grown up clones
was measured using a SPECTRAmax GEMINI Dual-Scanning Microplate
Spectrofluorometer (Molecular Devices), ex480, em520 (cut-off 515 nm). Plasmids from
fluorescent wells were sequenced afterwards. Pre- and post selection populations were
also scored by PCR screening as well as by plating on min glu (M9 minimal medium +
glucose) plates and visualised using a fluorescence microscope.
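The sorting gate described above (the brightest 1% of cells) can be sketched as a simple rank-based threshold. This is an illustration of the gating idea only, not the sorter's own software; the intensity values and function names are hypothetical.

```python
def top_fraction_threshold(intensities, fraction=0.01):
    """Fluorescence value at or above which a cell falls in the brightest `fraction`."""
    n_keep = max(1, int(len(intensities) * fraction))
    ranked = sorted(intensities, reverse=True)
    return ranked[n_keep - 1]


def gate(intensities, threshold):
    """Indices of cells whose fluorescence meets the threshold."""
    return [index for index, value in enumerate(intensities) if value >= threshold]


if __name__ == "__main__":
    # Hypothetical population of 1000 cells with distinct intensities 0..999.
    population = list(range(1000))
    cutoff = top_fraction_threshold(population)
    selected = gate(population, cutoff)
    print(f"threshold={cutoff}, cells sorted={len(selected)}")
```

If several cells share the threshold intensity, slightly more than the requested fraction passes the gate.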

10^ cells were sorted in total, of which 219 cells were in the top 1% fluorescent
population and 132 of these were captured to the 96-well plates. 13 cells from these were
fluorescent. Selected positives were checked by separating the mutant GCN4-bZIP DBD
expressor plasmids and re-transforming them together with cognate and non-cognate
reporter plasmids. None of the selected positives gave a fluorescent signal when
combined with non-cognate reporter plasmids, but all were fluorescent when combined with
the ATF/CREB cognate reporter plasmid (which did not produce any fluorescence when
transformed on its own).

This indicates that GFP selection indeed avoids the isolation of false positives.
Furthermore, when the library was checked prior to FACS sorting no fluorescent clones
were identified when plating > 10' cells. 1/10 clones plated post selection were
fluorescent, suggesting a selection factor in a single round in excess of 10^-fold.

All publications mentioned in the above specification are herein incorporated by
reference. All database sequences denoted by accession or gi numbers are likewise
incorporated by reference.

Various modifications and variations of the described methods and system of the
invention will be apparent to those skilled in the art without departing from the scope and
spirit of the invention. Although the invention has been described in connection with
specific preferred embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed, various modifications
of the described modes for carrying out the invention which are obvious to those skilled in
molecular biology or related fields are intended to be within the scope of the following
claims.

Claims

1. A method for detecting a protein-nucleic acid interaction between a nucleic acid molecule
and a protein molecule, comprising the steps of:

a) providing one or more hybrid σ54 activator proteins comprising a heterologous
nucleic acid binding sequence and a constitutively active σ54 transcription activating
domain;

b) providing one or more nucleic acid molecules comprising a binding site for the
nucleic acid binding sequence and a binding site for σ54 RNAP, which directs the
expression of a reporter gene and leads to upregulation thereof in response to activation by
the σ54 transcription activating domain; and

c) detecting expression of the reporter gene.

2. A method according to claim 1, comprising providing a repertoire of hybrid σ54
activator proteins, said repertoire comprising a plurality of different nucleic acid binding
sequences.

3. A method according to claim 1, comprising providing a repertoire of hybrid nucleic
acid molecules, said repertoire comprising a plurality of different binding sites for the
nucleic acid binding sequence.

4. A method according to claim 1, comprising providing both a repertoire according
to claim 2 and a repertoire according to claim 3.

5. A method for detecting a protein-protein interaction, comprising the steps of:

a) providing a first hybrid protein comprising a nucleic acid binding sequence and a
first polypeptide sequence bait;

b) providing a second hybrid protein comprising a prey polypeptide sequence and a
constitutively active σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the first and second hybrid proteins together with the nucleic acid
molecule such that the prey and bait polypeptide sequences may bind, thereby forming a
hybrid protein comprising both a nucleic acid binding sequence and a σ54 transcription
activating domain; and

e) detecting expression of the reporter gene.

6. A method according to claim 5, comprising providing a repertoire of first hybrid
proteins, said repertoire comprising a plurality of bait polypeptides.

7. A method according to claim 5, comprising providing a repertoire of second hybrid 
proteins, said repertoire comprising a plurality of prey polypeptides. 

8. A method according to claim 5, comprising providing a repertoire of first hybrid 
proteins and a repertoire of second hybrid proteins, said repertoires comprising a plurality 
of bait and prey polypeptides. 

9. A method for screening a repertoire of candidate DNA-bending
polypeptides, comprising the steps of:

a) providing a repertoire of candidate polypeptide factors with potential to induce
bending of DNA;

b) providing a σ54 activator protein comprising a nucleic acid binding sequence
and a σ54 transcription activating domain;

c) providing a nucleic acid molecule comprising a binding site for the nucleic acid
binding sequence and a binding site for σ54 RNAP which directs the expression of a
reporter gene and leads to upregulation thereof in response to activation by the σ54
transcription activating domain;

d) incubating the repertoire and σ54 activator together with the nucleic acid
molecule in a HIF* host cell, such that the σ54 activator and the nucleic acid molecule may
interact, and transcription is activated from the σ54 RNAP binding site in a manner
dependent on DNA bending by the polypeptide factors; and

e) detecting expression of the reporter gene.

10. A method according to any preceding claim, wherein the polypeptides are obtained
by expression within a bacterial host cell.

11. A method according to claim 10, wherein the polypeptides are encoded by one or
more libraries of nucleic acid vectors.

12. A method according to claim 11, wherein a first library of nucleic acid vectors
encodes a first chimeric gene, said gene comprising a nucleic acid sequence that encodes a
nucleic acid-binding domain and a nucleic acid sequence encoding a first (bait) test protein or
protein fragment in such a manner that the first test protein is expressed as part of a hybrid
protein with the nucleic acid-binding domain.

13. A method according to claim 11, wherein a second library of nucleic acid vectors
encodes a second chimeric gene, said gene comprising a nucleic acid sequence that
encodes a σ54 transcriptional activation domain and a second (prey) test protein or protein
fragment into the vector, in such a manner that the second test protein is capable of being
expressed as part of a hybrid protein with the transcriptional activation domain.

14. A method according to any preceding claim, wherein the σ54 transcriptional
activator is selected from the group consisting of:

dbj|BAA16379.1| (D90877) FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR;
emb|CAA26472.1| (X02616) pot. Nifa gene product (aa 1-484) [Klebsiella pneumoniae];
emb|CAA53584.1| (X75972) anfa [Rhodobacter capsulatus];
emb|CAA92413.1| (Z68203) nifa homologue [Rhizobium sp.];
emb|CAA93242.1| (Z69251) mopr [Acinetobacter calcoaceticus];
emb|CAB53157.1| (X07567) nifa1 [Rhodobacter capsulatus];
emb|CAB56537.1| (AJ249642) response regulator [Pseudomonas stutzeri];
gb|AAA58220.1| (U18997) ORF_o532 [Escherichia coli];
gb|AAA99303.1| (L43064) regulatory protein [Pseudomonas aeruginosa];
gb|AAB91397.1| (AF033203) nifaii protein [Rhodobacter capsulatus];
gb|AAC05586.1| (AF006075) regulatory protein [Bacillus subtilis];
gb|AAC37124.1| (L81176) fleq [Pseudomonas aeruginosa];
gb|AAC45640.1| (AF010585) putative sigma 54 activator [Caulobacter crescentus];
gb|AAC46367.1| (AF014113) two-component response regulator [Vibrio cholerae];
gb|AAD34591.1|AF145956_1 (AF145956) transcriptional activator nifa [Rhodospirillum rubrum];
gb|AAD38416.1| (AF155934) nifa [Alcaligenes faecalis];
gb|AAF28395.1| (AF069392) flam [Vibrio parahaemolyticus];
gb|AAF33506.1| (AF170176) Salmonella typhimurium transcriptional regulatory protein;
gb|AAF61932.1| (AF230804) sigma-54 activator protein Act 1 [Myxococcus xanthus];
gb|AAF85342.1|AE004061_7 (AE004061) two-component system, regulatory protein [Xylella fastidiosa];
gb|AAF94676.1| (AE004230) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF95280.1| (AE004286) sigma-54 dependent response regulator [Vibrio cholerae];
gb|AAF96095.1| (AE004358) sigma-54 dependent transcriptional regulator [Vibrio cholerae];
gb|AAG01527.1|AF288483_1 (AF288483) nifa [Azospirillum brasilense];
pir||A48291 ornithine decarboxylase inhibitor - Escherichia coli;
pir||B49940 nitrogen regulator I homolog - Escherichia coli;
pir||C70320 transcription regulator nifa family - Aquifex aeolicus;
pir||C70396 transcription regulator ntrc family - Aquifex aeolicus;
pir||C70454 transcription regulator ntrc family - Aquifex aeolicus;
pir||D70315 transcription regulator ntrc family - Aquifex aeolicus;
pir||H69581 transcription activator of acetoin dehydrogenase operon acor - Bacillus subtilis;
pir||I39719 nitrogen regulatory protein - Agrobacterium tumefaciens;
pir||JC5471 regulatory protein nifa - Azospirillum lipoferum;
pir||T08624 probable ntrc-type response regulator - Eubacterium acidaminophilum;
sp|P03027|NIFA_KLEPN NIF-SPECIFIC REGULATORY PROTEIN;
sp|P09570|NIFA_AZOVI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P12627|VNFA_AZOVI NITROGEN FIXATION PROTEIN VNFA;
sp|P14375|HYDG_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P21712|YFHA_ECOLI HYPOTHETICAL 49.1 KD PROTEIN IN GLNB-PURL INTERGENIC REGION (ORFXB);
sp|P24426|NIFA_RHILT NIF-SPECIFIC REGULATORY PROTEIN;
sp|P25852|HYDG_SALTY TRANSCRIPTIONAL REGULATORY PROTEIN HYDG;
sp|P27713|NIFA_HERSE NIF-SPECIFIC REGULATORY PROTEIN;
sp|P30667|NIFA_AZOBR NIF-SPECIFIC REGULATORY PROTEIN;
sp|P38035|RTCR_ECOLI TRANSCRIPTIONAL REGULATORY PROTEIN RTCR;
sp|P54929|NIFA_AZOLI NIF-SPECIFIC REGULATORY PROTEIN;
sp|P56266|NIFA_KLEOX NIF-SPECIFIC REGULATORY PROTEIN;
sp|Q06065|ATOC_ECOLI ACETOACETATE METABOLISM REGULATORY PROTEIN ATOC (ORNITHINE/ARGININE;
sp|Q46802|YGEV_ECOLI HYPOTHETICAL SIGMA-54-DEPENDENT TRANSCRIPTIONAL REGULATOR IN;
sp|Q53206|NIFA_RHISN NIF-SPECIFIC REGULATORY PROTEIN; and
sp|Q9ZIB7|TYRR_ERWHE TRANSCRIPTIONAL REGULATORY PROTEIN TYRR.

15. A method according to any one of claims 1 to 14, wherein the σ54 transcriptional
activator is the NifA transcriptional activator or the PspF transcriptional activator.

16. A method according to any one of claims 1 to 14, wherein the hybrid σ54
transcriptional activator is NifA and activation resulting from NifA-σ54 RNAP interaction
is enhanced by the coexpression of wild-type or mutant NifA.

17. A method according to claim 16, wherein the hybrid σ54 transcriptional activator is
NifA from Azotobacter vinelandii, and the wild-type or mutant NifA is NifA from
Klebsiella pneumoniae.

18. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a binding site for a factor which induces DNA bending.



19. A method according to claim 18, wherein the factor is integration host factor (IHF).

20. A method according to any one of claims 1 to 17, wherein the nucleic acid
molecule comprises DNA that is intrinsically bent.

21. A method according to any preceding claim, wherein the nucleic acid molecule
comprises a nifH promoter from A. vinelandii driving a reporter gene.

22. A method according to any preceding claim, wherein the reporter gene is selected
from the group consisting of metabolic markers such as the lac operon (lacZ, lacY and
lacA); proteins conferring a fluorescent phenotype, such as GFP; proteins conferring
antibiotic resistance, such as Zeo; and proteins conferring another selectable property.

23. A method according to any preceding claim, which is carried out in the presence of
a compound which modifies protein-protein or protein-DNA interaction.

24. A method according to claim 22, wherein the compound is selected from the group
consisting of molecules which alter the structure of the DNA-binding protein; molecules
which alter the structure of DNA; and molecules which modify protein-protein
interactions.

25. A method according to any preceding claim, which is carried out in vivo.

26. A method according to claim 25, wherein the in vivo host is E. coli.



27. A method according to any one of claims 1 to 24, which is carried out in vitro.

Figure 1
Figure 3
Figure 4